Identification of Degrading Effects in the Operation of Neighboring Photovoltaic Systems in Urban Environments

As photovoltaic technologies have emerged as one of the most promising renewable energy resources in urban environments, the monitoring and maintenance of such systems have gained significance. In order to support reliable system operation during the projected in-field operation lifetime, effective strategies for identifying potential problems in photovoltaic system operation are needed. In this paper, novel methods for the identification of degrading effects in the operation of neighboring photovoltaic systems are presented. The proposed methods are applicable for identifying panel aging properties, soiling effects, and the operation of photovoltaic systems under different shading scenarios. Since the proposed methods are based on the cross-correlation of the operation of neighboring systems, they are particularly suitable for performance assessment in urban environments. The proposed identification methods are integrated according to the adopted fog computing model, providing a scalable solution capable of uniform integration into distributed applications for the monitoring and maintenance of photovoltaic systems in urban areas. The details regarding the implementation of the identification methods in the form of data processing services, together with the service operation and dependencies, are also provided in this paper. The identification methods, integration concept, and related service operation are verified through the presented case study.


Introduction
In recent years, renewable technologies have become the main factor in the transformation of the global energy market, supporting aspirations for achieving sustainable development and low carbon emissions, while at the same time improving overall population health and worldwide energy access. The transition from climate-damaging fossil fuels towards clean forms of energy has enabled lower carbon dioxide (CO2) emissions, and therefore a reduction of air pollution, especially in urban areas. According to the estimates of the International Renewable Energy Agency (IRENA), CO2 emission reductions are expected to reach 21% by the year 2050, which could be achieved by rapidly deploying solar photovoltaic (PV) technology in combination with deep electrification [1]. Additionally, solar PV has the potential to supply roughly 25% of the global electricity demand by 2050, placing PV systems as the second largest renewable energy resource [1,2]. Solar PV systems are the most promising technologies for utilizing renewable energy resources in urban environments due to their high rooftop installation capacity. Moreover, it has been estimated that the global installed capacity for solar energy production will reach 337 GW in 2030 and 1089 GW in 2050 [3,4]. Consequently, there have been significant efforts from industry and research groups to improve the efficiency, reliability, maintenance, and fault diagnosis capability of PV systems. Recently, PV technology research has focused on increasing performance reliability rather than system efficiency [5].
There have been a number of international initiatives, highlighting the importance of advanced
• Methods for PV system performance assessment based on the temporal analysis of PV system performance metric parameters enable automated detection and identification of different degrading effects causing system faults and underperformance;
• Despite the heterogeneity of PV systems and their properties, the presented methodology uniformly integrates and utilizes underlying methods used for solar irradiance measurement or assessment;
• The proposed performance assessment methodology is seen as a part of automated monitoring and maintenance services and supplementary to periodic on-site inspections;
• The adopted tiered architectural style and fog computing approach enable a scalable solution for the seamless integration of large numbers of distributed PV systems in urban environments, which is in line with the requirements of smart city concepts;
• Cross-correlation-based services and corresponding system-level parameters enable the discovery of erroneous measurements and sensor and operational faults, improving the reliability of information and the accuracy of performance assessment.
This paper is outlined as follows. In Section 2, a review of the most recent studies and methods for monitoring and assessing PV system performance is provided. The details of the adopted background methodology, focusing on the introduced relevant parameters for PV system performance assessment, together with a description of services for automated detection of PV faults, are provided in separate subsections in Section 3. The system architecture and deployment process for key services are also given in Section 3. Section 4 provides information on the conducted case study, involving the investigation of PV systems in the neighborhood area, followed by analysis of the results. Section 5 includes concluding remarks and the directions for future work.
Electronics 2021, 10, 762

Related Work
Monitoring the performance of PV systems during their operational lifetimes is a challenging task due to the diversity of the system components, their runtime properties, and a variety of operational system conditions. Therefore, several industries and research groups have made significant efforts to identify the main issues and to define a suitable approach for handling such complex and challenging tasks. This section provides an overview of state-of-the-art concepts and methodologies for detecting and identifying degrading effects that influence photovoltaic system underperformance.
In urban environments, the underperformance of PV operations can be caused by various system-related or environment-related sources. Classified by the origin of the sources, the performance degradation can be related to the physical system properties (connection faults, mechanical defects, aging, manufacturing tolerances, etc.) or can result from operational exposure to shading, soiling effects, or atmospheric conditions [11,12,15–18]. In previous studies [19,20], the types and causes of PV system failures, as well as the different methods proposed in the literature, have been provided.
Recent studies [5,21,22] have provided overviews of the data analysis methods used by the research community and industry for the detection and classification of failures based on acquired performance data. The focus of such research is to improve system reliability and longevity through continuous real-time PV monitoring, since fault detection methods are indispensable to PV system reliability. Furthermore, accurate identification of failures and operational problems in PV system operation is crucial for reliable power production, minimizing power losses, and reducing maintenance costs.
In order to detect faults in operation, previous studies [11,23–30] have proposed methods based on the comparison of the performance results for a group of PV systems, whereby the estimation of the expected system output is performed using the gathered environmental and weather condition parameters. Comparison-based performance assessments utilize time-series data for an individual PV system power output [11]. The results have shown a high dependence of the comparison results on particular PV panel operation and weather conditions, as well as their applicability in performance assessment for a group of neighboring PV systems. Similarly, in the work presented in [31], the available PV system datasets with power measurements were compared to the reference station outputs, thus allowing the detection of their potential underperformance. These methods are highly dependent on the locations of the observed and reference stations, particularly on the differences in meteorological conditions at those locations. Besides strict comparisons involving reference systems, certain studies have involved the cross-correlation of operations inside a group of neighboring systems [23]. In order to quantify the correlation results numerically, a novel performance index was introduced. The index parameter, together with the PV system power output, has been used for fault detection. On the other hand, recent studies have introduced automatic fault detection procedures that are not dependent on environmental parameters [14,28,32]. The developed methods do not require any specific monitoring hardware or the input of any operating conditions data, as they only require the energy production data. The development of system performance indicators by comparing the energy production data for neighboring PV systems enables automatic fault detection, in particular for systems at nearby locations, and thus with similar environmental conditions.
The study presented in [8] analyzed different aspects that are crucial for the development of effective monitoring systems for small- and medium-scale PV plant applications, such as sensing, acquisition, data storage, and analytics.
An analytical method is used as a basic approach for the detection of PV system underperformance in many studies [11,14,23,25–28,30]. In some cases [31,33,34], the proposed methodology is based on the comparison of gathered measurement information from the PV system with the values obtained from an adopted reference model. The detection of system underperformance occurs when the predefined difference values are reached. Other studies have proposed alternative methods based on hardware redundancy [8], as well as the combination of standard statistical models with artificial intelligence techniques [35,36], specifically machine learning algorithms [37] and neural network algorithms [38]. The developed fault detection algorithms depend on the variations of the voltage and the power of the PV systems; thus, they are capable of detecting faulty PV modules and different conditions.
As presented, most fault detection methods highly depend on the input parameters derived from the installed PV system and its operating conditions. Common problems that occur and often lead to miscalculations, and therefore to undetected faults, include inaccuracies in the metadata from the PV system, unreliable solar irradiation data, and thermal losses due to different weather conditions [14,39]. In [39–42], based on the monitoring of several hundred PV systems, it was determined that the fault detection methods used experienced limitations due to unreliable input data. However, the methods based on performance analysis enable automatic, real-time detection, thus optimizing time consumption, with acceptable maintenance costs [5].
On the other hand, more studies have been conducted on PV system fault detection based on imaging techniques [43–46]. These techniques rely on different types of imaging, including infrared thermography, ultraviolet fluorescence, photoluminescence, and electroluminescence. The focus of such approaches is on the visual assessment of the PV systems, despite the need for additional equipment and time for image processing. The details of the different types of methods for PV system fault detection and their advantages and disadvantages are given in Table 1. The methods presented in this paper are cross-correlation analysis methods with environmental sensing. However, these methods have improved properties, since they do not depend on the meteorological conditions. In addition to the overview of the relevant findings regarding the theoretical, analytical, and technological background used in measurements and data processing, our approach also addresses integration-related issues. The integration model applicable in our scenario needs to comply with hierarchically organized data transport supported through a layered or tiered architecture capable of integrating different types of end-devices and consumer-oriented services. Our approach perceives the distributed energy resources (DER) management and maintenance operations as a part of a more complex distributed system.
There are many studies available in the literature addressing the general challenges regarding the integration and application of renewable energy sources in urban and remote microgrid systems [47–55]. Although most studies are mainly focused on energy-related integration issues, some studies [56,57] outline the prospects for other industrial, commercial, and household applications. Control and management techniques and methods applicable for large-scale photovoltaic power stations were identified in [58] as essential. In this regard, the system's ability to support third-party access for custom application development was found to be a requirement of the PV system monitoring equipment in order to support a diversity of applications and services. However, in order to integrate such applications and services, the system-level design must involve a scalable architectural approach capable of handling large-scale data processing under real-time or near-real-time constraints. As an extension of cloud-based computing, the fog computing approach offers many benefits with respect to these requirements [59]. The idea of fog computing is to add a hierarchy of elements between the core cloud services and network edge devices to meet the challenges regarding real-time data processing, scalability, reliability, security, and high performance in an open and interoperable way [60]. The fog-based approach at the architectural level enables hierarchically organized data transport, integration of locally connected end-devices, data aggregation, large-scale upload capabilities in vertical communication, and horizontal service integration [61].

Methodology and System Overview
The methodology used for the identification of the particular degrading effects of the PV system is based on the cross-correlation of each system's operation estimates within the group of neighboring systems. This section provides the details of the sensor-side data processing and parameter extraction methods, as well as relevant services for correlation-based performance assessments. According to the adopted fog-based reference model, the details of the system-level architecture and the service deployment at the different tiers of the system architecture are also given. Section 3.1 gives insights into the methodology background by introducing the relevant system parameters and related parameter assessment methods. The parameters are used to quantify system operational properties, providing inputs for deployed data processing services. Section 3.2 introduces an identification method for detecting photovoltaic system underperformance under different types of degrading effects, while the details of the system architecture and service deployment are given in Section 3.3.

Methodology Background
The detection of various operational conditions of the PV system and the assessment of its operation were based on the analysis of the system metric parameters introduced in the previous study [11]. The adoption of the system efficiency factor $\eta_{SF}$ enabled compensation for the influence of various factors, including panel manufacturing tolerances, measurement imperfections, physical failures, shading operation, panel aging, and different sensor thermal properties, on the assessment of the system operation. In particular, the system efficiency factor was utilized to compensate for the time-varying operational and physical system properties of each PV panel in the intersystem comparison under clear-sky conditions. The system efficiency factor is defined as the ratio of the estimated total horizontal irradiance $G_H$ to the reference irradiance value $G_0$ [11]:
$$\eta_{SF} = \frac{G_H}{G_0} \qquad (1)$$
The system efficiency factor is highly time-dependent; therefore, the actual value of this factor has to be found empirically from the analysis of the system operation over a considerably long period of time, defined as the observation period (typically one month).
Since the assessment of the system operation is adequate only during time intervals where the panel operation is not affected by surrounding objects, further analysis is focused on the corresponding time intervals, defined as the correlation window [11]. The correlation window $CW_i$ is defined as a collection of averaging time intervals when the system operation has the potential to be correlated with the operation of other nearby systems, as given by Equation (2) [11], where $K_{CW}$ is the determination coefficient for the time interval to be included in the correlation window, $G_0$ is the reference value of the total horizontal irradiance in the particular time interval, $\eta_{SFmax}$ is the maximum value of the system efficiency factor in the corresponding observation period, and $G_{Hmax}$ is the daily profile with the maximum estimated interval values found in the overall observation period.
Similarly, considering the execution of cross-correlation services, the corresponding analysis is valid during the adequate cross-correlation window, which is defined for a pair of correlated PV systems as the overlapping time interval found in the correlation windows of both systems, i.e., $CW_{ij}$. It should be noted that the potential for the execution of cross-correlation services exists only for panels whose cross-correlation window $CW_{ij}$ exists. If $CW_{ij} = \emptyset$, the operations of the corresponding systems $i$ and $j$ are not correlated. For two correlated systems in the cross-correlation time interval, it is possible to calculate the estimation difference factor $ED_{ij}$, which is given as the difference between the individual estimation difference factors $ED_i$ and $ED_j$, as defined by Equation (3) [11]. The detection of the panel fault operation occurs when the estimation difference factor $ED_{ij}$ becomes greater than the boundary fault detection coefficient $K_{FD}$. To summarize, the system metric parameters $\eta_{SFi}$, $CW_i$, $CW_{ij}$, and $ED_{ij}$ are used as the inputs for the methods that identify the cause of the PV underperformance, whereas the duration of the observation period and the coefficients $K_{CW}$ and $K_{FD}$ are system configuration parameters. Additionally, the system metric parameters are an outcome of the data aggregation process, whereby the gathered sensor data are transformed into a summarized form for the associated averaging interval. These parameters are sufficiently informative for further data analysis related to the developed identification methods.
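The relationship between the correlation windows, the estimation difference factors, and the boundary coefficient can be sketched in Python. This is a minimal illustration only: it assumes that each correlation window is a set of averaging-interval indices and that the per-system $ED_i$ values are already available, since Equation (3) is not reproduced here. The function names, the plain difference $ED_i - ED_j$, and all sample numbers are hypothetical.

```python
def cross_correlation_window(cw_i, cw_j):
    """Intervals found in the correlation windows of both systems (CW_ij)."""
    return cw_i & cw_j


def estimation_difference(ed_i, ed_j, cw_ij):
    """ED_ij per interval, taken here as ED_i - ED_j (an assumption)."""
    return {t: ed_i[t] - ed_j[t] for t in sorted(cw_ij)}


def faulty_intervals(ed_ij, k_fd):
    """Intervals in which ED_ij exceeds the boundary coefficient K_FD."""
    return [t for t, value in ed_ij.items() if value > k_fd]


# Hypothetical interval indices and ED values for two neighboring systems.
cw_1 = {10, 11, 12, 13}
cw_2 = {12, 13, 14}
ed_1 = {10: 0.02, 11: 0.03, 12: 0.30, 13: 0.04}
ed_2 = {12: 0.05, 13: 0.03, 14: 0.02}

cw_12 = cross_correlation_window(cw_1, cw_2)
ed_12 = estimation_difference(ed_1, ed_2, cw_12)
flagged = faulty_intervals(ed_12, k_fd=0.1)
```

An empty result from `cross_correlation_window` corresponds to the case in which the two systems are not correlated, so no $ED_{ij}$ values are produced for that pair.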

Identification Methods
It should be noted that the initial value of the system efficiency factor $\eta_{SF}$ is available after the initial observation period. After that period, the value of $\eta_{SF}$ is updated daily, based on the measurements gathered in the time window defined as the observation period.
During regular system operation after commissioning, estimation of system performance is valid only during the detected system operation period under clear-sky conditions. The clear-sky conditions are identified through expression (4), which defines the boundaries of such system operations [11] and in which $K_{CS}$ is also a configuration parameter. If the clear-sky condition is found in the group of correlated PV systems, execution of the related cross-correlation services for the observed group of panels is assumed to produce valid results, regardless of whether the clear-sky condition is confirmed for all panels in the group.
During fully shaded or partially shaded panel operation, under the detected clear-sky conditions, PV system $i$ is expected to produce less power, causing a significant increase of all $ED_{ij}$ parameters in the group of correlated panels. For on-surface obstacles, the effect is expected to be found in all time intervals included in a particular $CW_{ij}$, since $ED_{ij}$ values are calculated for each time interval. On the other hand, for off-surface obstacles, the effect is expected to be found in only some of the time intervals included in $CW_{ij}$, since the panel shading depends on the incident angle of sunlight.
In order to give the methods for the detection of panel fault operation with on-surface and off-surface obstacles in the form of analytical expressions, we adopted the following notation: $ED_{ij}(t, d)$ represents the estimation difference factor for correlated panels $i$ and $j$ for a time interval $t$ on a particular day $d$. One should keep in mind that each of the methods provides valid outputs only for time intervals belonging to the cross-correlation window, i.e., for intervals $t \in CW_{ij}$.
In the space of $ED_{ij}(t, d)$ estimates for a particular time interval $t$ during several days of panel operation, the fault operation of panel $i$, whose operation is correlated with the operation of other panels $j = 1..k$, is defined by condition (5). It is important to highlight that the result of the comparison given by (5) is valid only for time intervals defined by the tuple $(t, d)$ where condition (4) is satisfied. If condition (5) is fulfilled for all time intervals from the cross-correlation window, the degrading effect is the consequence of an on-surface obstacle; the analytical expression used to evaluate this is condition (6). For the detection of off-surface obstacles, the panel operation does not exhibit the behavior outlined by expressions (5) and (6), but rather that given by expressions (7) and (8). The identification of off-surface obstacles is achieved by meeting condition (5) in a certain number of time intervals, e.g., $t_1, t_2, t_3, \ldots, t_m \in CW_i$, which belong to the corresponding cross-correlation windows. Since the same PV system operation behavior is expected to be found on the following days, the detected panel underperformance can be verified in the same time intervals $t$ found on consecutive days, all in clear-sky conditions.
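Since the numbered conditions are described here only in prose, the following LaTeX block gives one possible formalization of them. It is a sketch derived from the surrounding text, not the original expressions (5) to (8); the indicator $F_i(t,d)$, the window size $N$, and the bound $1 \le m < N$ are notational assumptions.

```latex
% Fault indicator for panel i in interval t on day d (prose of condition (5)):
% the estimation difference exceeds K_FD against every correlated panel j.
F_i(t,d) = 1 \iff ED_{ij}(t,d) > K_{FD} \quad \forall j = 1,\dots,k

% On-surface obstacle: the indicator holds in every interval of the window.
F_i(t,d) = 1 \quad \forall t \in CW_{ij}

% Off-surface obstacle: the indicator holds in m intervals only,
% so at least one interval of the window remains unaffected.
\sum_{t \in CW_{ij}} F_i(t,d) = m, \qquad 1 \le m < N, \quad N = |CW_{ij}|
```

In both cases the evaluation is meaningful only for tuples $(t, d)$ on which the clear-sky condition (4) holds.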
The evaluation of conditions (4) to (6) on one hand, as well as conditions (4), (7), and (8) on the other hand, is part of the fault detection services, while extraction of cross-correlation parameters is part of the (cross-)correlation services. It should be noted that the execution of both service types assumes the availability of time series data, including information for maximal horizontal irradiance values $G_{Hmax}$ with the corresponding date and time stamps, time series of daily estimates of $\eta_{SFi}$, and interval-related $ED_i$ values, as part of the shared data resource repository.
Soiling effects, in particular short-term soiling effects, can be detected through linear regression analysis of a time series of estimation difference $ED_i(t, d_j)$ values for a particular time interval $t$ found on consecutive days, all with the detected clear-sky conditions. If the input time series data are given as $ED_i(t,d_0), ED_i(t,d_1), ED_i(t,d_2), \ldots, ED_i(t,d_{n-1})$, where $n$ is the number of days in observation, the slope $b$ of the linear regression model is found from Equation (9). Since the panel soiling produces a constant effect on panel underperformance, the slope of the linear regression model is expected to be correlated for all $N$ time intervals found in the panel correlation window. Therefore, the detection of the soiling effect in the panel operation occurs if expression (10) holds, where $\sigma$ stands for the standard deviation value and $K_\mu$ and $K_\sigma$ are coefficients defining the sensitivity of the proposed method. Long-term soiling effects are related to the panel underperformance operation in a time interval comparable with the duration of the observation period. Such a long period of exposure to soiling influences both the $\eta_{SFi}(d)$ and $ED_i(t,d)$ panel metric parameters, and is therefore detectable by time series trend analysis, as given by Equations (9) and (10).
Since the decrease of $\eta_{SFi}(d)$ given by (1) slightly compensates the increase of $ED_i(t,d)$ given by (3), the outcome of such a detection method is highly sensitive to the particular panel operation and soiling properties.
The aging properties of the panel operation can be easily monitored, since an inherent property of the methodology for the estimation of the system efficiency factor $\eta_{SFi}$ is that $\eta_{SFi}$ compensates aging and other long-term effects with a duration longer than the observation period. The aging effect is directly observable through the changes of the multiannual profiles of the $\eta_{SFi}(d, y)$ parameter using the linear regression method, where $\eta_{SFi}(d, y)$ stands for the $\eta_{SFi}$ value on a particular day $d$ of year $y$. In order to avoid seasonal interference, it is necessary to observe the series of $\eta_{SF}$ values where day $d$ corresponds to a particular day or a part of the season.
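The regression-based soiling check can be sketched in Python as follows. The slope is the ordinary least-squares slope against the day index, which is a standard formula and may differ in form from Equation (9). The decision rule is an assumption made for illustration: the mean slope across intervals must exceed $K_\mu$ and the standard deviation of the slopes must stay below $K_\sigma$. The function names, sample values, and coefficient values are hypothetical.

```python
from statistics import mean, pstdev


def regression_slope(values):
    """Least-squares slope of values against day indices 0..n-1 (n >= 2)."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def soiling_detected(ed_series_by_interval, k_mu, k_sigma):
    """Flag soiling when the slopes rise on average and agree across intervals.

    The comparison directions (mean above K_mu, spread below K_sigma)
    are assumptions made for this sketch.
    """
    slopes = [regression_slope(s) for s in ed_series_by_interval.values()]
    return mean(slopes) > k_mu and pstdev(slopes) < k_sigma


# Hypothetical ED_i(t, d) values on five consecutive clear-sky days.
series = {
    "11:00": [0.020, 0.031, 0.039, 0.050, 0.061],
    "11:15": [0.018, 0.029, 0.040, 0.049, 0.058],
    "11:30": [0.022, 0.030, 0.041, 0.052, 0.060],
}
result = soiling_detected(series, k_mu=0.005, k_sigma=0.002)
```

Requiring a small spread of the slopes reflects the expectation stated above that soiling affects all intervals of the correlation window in a similar way, which separates it from shading that appears only in some intervals.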
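The multiannual trend analysis can be illustrated with a short Python sketch. It assumes that the $\eta_{SFi}(d, y)$ estimates are stored in a dictionary keyed by (day of year, year), restricts the series to a chosen set of days to avoid seasonal interference, and regresses the yearly mean against the year. The data layout, the function name, and the sample values are hypothetical.

```python
from collections import defaultdict


def aging_slope(eta_sf, days):
    """Least-squares slope of the yearly mean eta_SF over the selected days."""
    per_year = defaultdict(list)
    for (day, year), value in eta_sf.items():
        if day in days:  # keep only the same part of the season in every year
            per_year[year].append(value)
    years = sorted(per_year)
    means = [sum(per_year[y]) / len(per_year[y]) for y in years]
    x_mean = sum(years) / len(years)
    y_mean = sum(means) / len(means)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, means))
    denominator = sum((x - x_mean) ** 2 for x in years)
    return numerator / denominator


# Hypothetical eta_SF estimates keyed by (day of year, year).
eta_sf = {
    (172, 2018): 0.95, (173, 2018): 0.95,
    (172, 2019): 0.94, (173, 2019): 0.94,
    (172, 2020): 0.93, (173, 2020): 0.93,
    (10, 2020): 0.80,  # winter value, excluded by the day filter
}
slope_per_year = aging_slope(eta_sf, days={172, 173})
```

A negative slope indicates a year-over-year decline of the efficiency factor for the selected days; at least two distinct years are needed for the slope to be defined.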

System Architecture and Service Deployment
The proposed identification methods, as a part of the overall performance assessment methodology, are seen as a part of a larger-scale distributed application for PV system performance assessment and automated detection of fault operation. In order to support such an application with high data processing requirements under real-time or near-real-time constraints, the design of the distributed system components must support a scalable architectural approach. As an extension of the cloud-only solution, the fog computing approach provides the support necessary to fulfil this requirement at the architectural level. Following the given application context, the details of the fog-based architecture of the PV systems operation assessment and the details of the deployment of methodology-related processing services are shown in Figure 1. The number of fog nodes at a particular tier corresponds to the geolocational distribution and density of the deployed measurement infrastructure. The nodes' functionality varies based on their positions in the particular tier in this N-tier architectural model. Nodes in tier 1 collect measurement data from the connected sensors and PV panel devices. They also perform data preprocessing operations, including sampling, filtering, and formatting followed by data aggregation, and communicate with other nodes that are included in the particular cross-fog application.
The fog nodes in tier 2 execute several application services, providing information about the system operation status, data analysis, event notification, detection, and identification of end-device fault operations. In tier 3, the fog nodes combine information obtained from tier 2 fog nodes covering a wider district or metropolitan area. Since the nodes that are closer to the network core services located in the cloud have higher processing and communication capabilities, they perform more complex analytics to provide more sophisticated services related to the operation of the overall network of PV systems in the district area. However, certain fog nodes are not directly related to PV system operation monitoring, so they can be part of other IoT applications and systems.
Physical deployment is organized according to the tiered architectural style. The utilized data processing methods are encapsulated as service components that reside in the different tiers of the system architecture. The bottom tier of the presented architecture contains connected end-devices that provide measurement data for PV panel properties ($I_{PV}$, $V_{PV}$, $T_{PV}$), environmental temperature, and other relevant atmospheric conditions (relative humidity, air pressure, etc.), embedded in dataset 1.
Collected raw sensor data are preprocessed by services that reside in the application service layer in tier 1, providing an estimate of the total horizontal irradiance $G_H$ available for each averaging time interval. The required configuration parameters ($\beta_{TC}$, $\eta_r$, $A$, $\beta$, $\gamma$, $\phi$) utilized by the preprocessing services are part of the configuration data structure in the application support service layer. The subsequent aggregation services in tier 1 perform a series of operations to calculate $G_{Hmax}$, the daily profile with the maximum estimated interval values found in the overall observation period, using the obtained $G_H$ data and the observation period parameter written in the configuration data structure. Both $G_{Hmax}$ and $G_H$ data are a part of dataset 2, and they are forwarded to tier 2 for further processing.
Parameter assessment services in tier 2 govern the process for calculating system efficiency parameters based on the reference irradiance value $G_0$, which is locally stored at the application support service layer in tier 2, together with the decisive coefficients $K_{CW}$, $K_{FD}$, $K_\mu$, and $K_\sigma$, as presented in Equations (1) and (2). As a result of these services, the corresponding system efficiency factor $\eta_{SFi}$ and correlation window $CW_i$ for each PV system are found. The subsequent (cross-)correlation services calculate the estimation difference parameter $ED_{ij}$ according to Equation (3) for each pair of correlated systems in the matching cross-correlation window $CW_{ij}$.
The results of the correlation processing services are forwarded to the fault detection services in the form of a dataset, which includes the system metric parameters ($\eta_{SFi}$, $CW_i$, $CW_{ij}$, $ED_i$, $ED_{ij}$). Three fault detection services each identify one class of degrading effects, namely on-surface (FD1), off-surface (FD2), and soiling (FD3), while their outputs are operational statuses of the individual PV systems in the form of detected regular PV system operation (pass status) or irregular operation (fail status). If the clear-sky condition is not detected, the service output is given as a skip status regardless of the actual estimation difference value. Fault detection service FD1 determines if an on-surface fault has occurred by comparing the values of the estimation difference parameter $ED_{ij}$ with the boundary fault detection coefficient $K_{FD}$. According to the implemented method previously described in Section 3.2, the FD1 service first executes the clear-sky condition identification, which, if positive, is followed by the assessment of each PV system for all identified $ED_{ij}$ values for every time interval in the matching cross-correlation window $CW_{ij}$, analytically given by condition (6). If this condition is fulfilled, the FD1 service generates a fail status; if not, the status is pass.
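The tier 1 aggregation step that produces the $G_{Hmax}$ profile can be sketched in Python. The sketch assumes that each day is represented as a list of $G_H$ estimates with one value per averaging interval, and that the observation period is expressed as a number of most recent days (at least one); the function name and the sample irradiance values are hypothetical.

```python
def max_daily_profile(daily_profiles, observation_days):
    """Interval-wise maximum of G_H over the most recent observation period."""
    recent = daily_profiles[-observation_days:]
    # zip(*recent) groups the values of the same averaging interval across days.
    return [max(interval_values) for interval_values in zip(*recent)]


# Hypothetical G_H estimates (W/m^2): three days, four averaging intervals each.
g_h = [
    [410.0, 620.0, 655.0, 430.0],
    [395.0, 640.0, 600.0, 445.0],
    [420.0, 610.0, 670.0, 380.0],
]
g_h_max = max_daily_profile(g_h, observation_days=3)
```

Because the maximum is taken per interval rather than per day, the resulting profile may combine values from different days of the observation period.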
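The pass, fail, and skip logic of the FD1 service can be sketched in Python. The sketch assumes that the clear-sky decision is supplied as a boolean (expression (4) is not reproduced here) and that $ED_{ij}$ values are given per correlated peer and per interval of the cross-correlation window. Returning a skip status when no $ED_{ij}$ values exist is an additional assumption; all names and sample values are hypothetical.

```python
from enum import Enum


class Status(Enum):
    PASS = "pass"
    FAIL = "fail"
    SKIP = "skip"


def fd1_status(clear_sky, ed_ij_by_peer, k_fd):
    """On-surface check: fail only if ED_ij > K_FD for every peer and interval."""
    values = [v for series in ed_ij_by_peer.values() for v in series.values()]
    if not clear_sky or not values:
        # No valid assessment without clear sky or without any correlated peer.
        return Status.SKIP
    return Status.FAIL if all(v > k_fd for v in values) else Status.PASS


# Hypothetical ED_ij values of one panel against two peers in intervals 12 and 13.
shaded = {"P2": {12: 0.25, 13: 0.30}, "P3": {12: 0.22, 13: 0.27}}
regular = {"P2": {12: 0.25, 13: 0.02}, "P3": {12: 0.22, 13: 0.01}}

status_shaded = fd1_status(True, shaded, k_fd=0.1)
status_regular = fd1_status(True, regular, k_fd=0.1)
status_cloudy = fd1_status(False, shaded, k_fd=0.1)
```

In this sketch a panel that exceeds the boundary in only some intervals receives a pass status from FD1, leaving that pattern to the off-surface check.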
Fault detection service FD2 is responsible for the detection of off-surface degradation effects, preconditioned by the same clear-sky condition identification as in the FD1 service.
Comparing both values of individual estimation difference parameter (ED i ) for a single PV system in a matching correlation window interval CW i and every estimation difference parameter (ED ij ) for a pair of correlated PV systems in the matching cross-correlation window interval CW ij with boundary fault detection coefficient K FD , it is possible to determine the fault operation, which is analytically given by conditions (7) and (8). In order to generate a failed status for the FD2 service, it is necessary that in a certain number m out of N time intervals, the values of (ED i ) and (ED ij ) are over the boundary fault detection level, whilst there is at least one time interval i from corresponding correlation and cross-correlation windows where compared values are under the fault detection level. The value of m is adopted as a number comparable to the ratio of the cross-correlation window interval to the duration of time interval i, in order to avoid possible mismatch in ED values at the boundaries of the cross-correlation window. It is to be expected that the same behavior of the PV system operation would be found in the same time intervals in the following clear-sky days; therefore, the detected panel underperformance operation can be verified and tracked.
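The FD2 decision described above can be sketched in Python. The sketch reads the requirement as "at least $m$ intervals over the boundary and at least one interval in which all compared values are under it"; whether $m$ is meant as an exact count or a minimum is not stated, so the "at least" reading is an assumption. Only intervals present in $ED_i$ and in every $ED_{ij}$ series are considered, and all names and sample values are hypothetical.

```python
def fd2_failed(ed_i, ed_ij_by_peer, k_fd, m):
    """Off-surface check over the intervals shared by ED_i and all ED_ij series."""
    intervals = set(ed_i)
    for series in ed_ij_by_peer.values():
        intervals &= set(series)

    def values_at(t):
        # ED_i and every ED_ij of the observed panel in interval t.
        return [ed_i[t]] + [series[t] for series in ed_ij_by_peer.values()]

    over = [t for t in intervals if all(v > k_fd for v in values_at(t))]
    under = [t for t in intervals if all(v <= k_fd for v in values_at(t))]
    return len(over) >= m and len(under) >= 1


# Hypothetical values: intervals 1 and 2 are shaded, intervals 3 and 4 are not.
ed_1 = {1: 0.30, 2: 0.30, 3: 0.02, 4: 0.01}
ed_1j = {
    "P2": {1: 0.25, 2: 0.28, 3: 0.010, 4: 0.000},
    "P3": {1: 0.20, 2: 0.22, 3: 0.015, 4: 0.005},
}
partially_shaded = fd2_failed(ed_1, ed_1j, k_fd=0.1, m=2)
```

If every interval is over the boundary, `fd2_failed` returns `False`, since that pattern matches the on-surface case rather than the off-surface one.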
Fault detection service FD3 enables detection of the short-term soiling effect, the long-term soiling effect, and the aging effect in PV system operation. Following the same clear-sky condition identification as in the FD1 service, the FD3 service calculates the slope b of the linear regression model from the input time-series data of the estimated difference ED i (t,d j ) for each time interval in the matching correlation window interval CW i , as given analytically by condition (10). The coefficients K µ and K σ define the sensitivity of the FD3 service and therefore the boundary values at which the service reports a fault. By combining the current day status with the previously stored historical data for short-term soiling effect statuses, it is possible to identify the long-term soiling effects. An additional verification for the long-term effects is a significant decrease of η SF i (d) over a time interval comparable with the duration of the observation period. Moreover, to identify the aging effect, the FD3 service uses the same linear regression method to calculate the slope b from the input time-series data of the system efficiency factor η SF i (d, y), that is, the η SF i value on a particular day d of year y, for each time interval in the matching correlation window interval CW i .
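The slope b used by the FD3 service can be obtained with an ordinary least-squares fit. The following sketch computes b for one averaging interval across consecutive days and skips days on which no clear-sky value is available. The function name and the use of None for skipped days are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def regression_slope(days: Sequence[int],
                     values: Sequence[Optional[float]]) -> float:
    """Least-squares slope b of ED_i (or eta_SF) against the day index.

    A value of None marks a day without a detected clear-sky condition;
    such days are excluded from the fit.
    """
    pts = [(d, v) for d, v in zip(days, values) if v is not None]
    if len(pts) < 2:
        raise ValueError("at least two valid points are required")
    n = len(pts)
    mean_d = sum(d for d, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    # Standard closed-form solution: b = S_xy / S_xx.
    s_xx = sum((d - mean_d) ** 2 for d, _ in pts)
    s_xy = sum((d - mean_d) * (v - mean_v) for d, v in pts)
    return s_xy / s_xx
```

The same helper applies to the aging analysis if the input series holds η SF i values for a given day across several years instead of ED i values across days.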
As the fault detection service outputs are operational statuses of the individual PV systems in the form of regular or failed statuses, they are embedded in dataset 3 and forwarded to the more complex analytics services in tier 3 fog nodes. Further details of particular service operations for tier 3 nodes are outside of the scope of this paper.

Case Study
Deployment details for the investigated neighboring PV systems are given in Figure 2. For the observed geographical area, the representative fog node in tier 2 is fog node N 21 . Fog node N 21 controls three PV sensor systems in tier 1, installed at separate locations on the rooftops of the university buildings and associated research facilities. PV panels P 1 and P 2 , together with environmental sensors S 1 and S 2 , are connected to fog node N 11 , located at the ES (Embedded Systems) lab at the Technical Faculty Pavilion. Similarly, PV panel P 3 and temperature sensor S 3 are located at the ICEF (Innovation center of School of Electrical Engineering) research laboratory and integrated into the fog architecture through fog node N 12 , as PV panels P 4 and P 5 and temperature sensor S 4 , located at the faculty main building, are connected to fog node N 13 . It is not mandatory that node N 21 in tier 2 resides in the same neighborhood area as node N 1X from tier 1, since the node does not perform actual measurements, but rather processes data obtained from connected fog nodes in tier 1. Higher-level services and further integration toward core services are supported through the infrastructure node N 31 .
The deployment scenario given in Figure 2 is selected as a representative scenario in an urban surrounding, since it consists of PV panels from different manufacturers and with different electrical and temperature characteristics. The case study presented in this section includes the performance assessment with the automatic detection of the faults of the five correlated PV panels, which were installed in close proximity while operating under different environmental conditions. The features of the PV panels used for this case study are provided in the Table 2.
The experiment was conducted over 70 days, from August until October of 2020, of which the first 20 days were used as the observation period. During the total period, the averaged horizontal irradiance values G H were gathered for all the panels, whereas the daily profile of reference irradiance G 0 was obtained from the ASHRAE clear-sky model [62]. Following the definitions given in Section 3.1, the system parameters G Hmax , η SF , η SFmax , CW i , CW ij , ED i , and ED ij were calculated for all panels and correlation pairs. The PV panel current and voltage signal sampling intervals were set to 1 s, while the averaging time interval was set to 10 min. The fault detection coefficient was set to K FD = 0.2, while the clear-sky coefficient was set to K CS = 0.85.
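With a 1 s sampling interval and a 10 min averaging interval, each averaged value aggregates 600 raw samples, which gives 144 averaged values per day. A minimal sketch of this aggregation step is given below. The function name is hypothetical, and dropping a trailing incomplete block is an assumption rather than a documented property of the deployed services.

```python
from statistics import fmean
from typing import List, Sequence

SAMPLE_PERIOD_S = 1           # sampling interval used in the case study
AVERAGING_PERIOD_S = 10 * 60  # averaging interval used in the case study


def block_average(samples: Sequence[float],
                  block: int = AVERAGING_PERIOD_S // SAMPLE_PERIOD_S) -> List[float]:
    """Average consecutive, non-overlapping blocks of raw samples.

    A trailing block with fewer than `block` samples is dropped (assumption).
    """
    full = len(samples) - len(samples) % block
    return [fmean(samples[i:i + block]) for i in range(0, full, block)]
```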
Since PV panels P1 and P2, as well as P4 and P5, are identical devices installed right next to each other, detecting their individual faults would be much easier than for an arbitrary correlated pair in the urban environment. Therefore, the three different types of defects were separately induced on the individually installed PV panel 3. The three experimental scenarios were conducted as follows.

Detection of the On-Surface Degrading Effect
In order to verify the presented method for detection of the on-surface degrading effect, the following experiment was performed. On day 41 of the experiment at 06:00, part of the surface of PV panel 3 was intentionally covered. The relevant data for PV panel 3 for five consecutive days, from day 38 to day 42, are presented in Figure 3.
For the five days of interest, the daily profiles of the estimated total horizontal irradiance G H (blue), its maximum normalized value G Hmax /η SFmax (purple), its reference irradiance value G 0 (red), and its boundary value for clear-sky conditions (green) for PV panel 3 are presented in Figure 3a, whereas the corresponding values for PV panel 4 are presented in Figure 3b. The corresponding values for PV panel 2 showed a similar trend to the values for PV panel 4, and therefore are not presented in a separate graph.
On day 41 of the experiment, after the clear-sky conditions given by condition (4) were confirmed and the initial time intervals with no irradiance or negligibly low levels of irradiance (night-time intervals and the time intervals around sunrise and sunset) were excluded from the assessment, the correlation windows for each PV panel (CW 2 , CW 3 , and CW 4 ) were determined using condition (2). The maximum values of the system efficiency factor, as given in Equation (1), in the corresponding observation period for the three panels were found to be η SF2max = 0.65, η SF3max = 0.63, and η SF4max = 0.66, respectively. Figure 3c,d present the correlation windows determined using condition (2), together with the cross-correlation windows for the correlated pairs P3-P4 and P3-P2. Information about the clear-sky conditions, extracted from Equation (4) and given as the DET or NOT condition, is also given in Figure 3c,d. One should keep in mind that if the normalized value of the total horizontal irradiance for a single panel or for both panels is above the boundary value given in Equation (4), the result of the cross correlation, as evaluated using Equation (5), is considered valid.
For the correlated pairs in the cross-correlation time interval, the estimation difference factors ED 34 and ED 32 were calculated, as provided in Equation (3). The values of the estimation difference factor for PV panels P3 and P4 (ED 34 ) are presented in Figure 3e, while the corresponding values for PV panels P3 and P2 (ED 32 ) are presented in Figure 3f. Information about the validity of the analysis, given in the form of PASS and SKIP notes, is also given in Figure 3e,f. A SKIP note indicates that the clear-sky condition was not detected for either of the correlated panels in any of the time intervals of the cross-correlation window.
On day 38, the clear-sky condition was not detected for either panel in the group of correlated panels; therefore, the correlation-related processing produced an invalid result, indicated as the SKIP status in Figure 3e,f. During days 39 and 40, all of the ED 34 and ED 32 values in the time intervals from CW 34 and CW 32 , respectively, were below the value of the fault detection coefficient K FD ; therefore, conditions (5) and (6) were not fulfilled. It can be concluded that the operation of the panels did not demonstrate on-surface degrading effects, with the validity of the analysis confirmed through the PASS status. On the other hand, both the ED 34 and ED 32 values during the cross-correlation windows on days 41 and 42 were higher than the value of the fault detection coefficient K FD , due to the increased ED 3 values. For day 41, conditions (5) and (6) were fulfilled; therefore, the on-surface degrading effect was confirmed and indicated as a FAIL status. The validity check regarding the clear-sky condition was confirmed for days 41 and 42, since the normalized value of the estimated total horizontal irradiance obtained from panel P4 was above the boundary curve (DET status), even though the matching value from panel P3 was significantly below the boundary value (NOT status).
It is necessary to point out that the system efficiency factor is expected to show gradual change during the succeeding observation period. Since the duration of the observation period is much longer than the daily period, it will not directly affect the outcome of the cross-correlation analysis.
As seen from the analysis, on-surface shading effects are successfully identifiable through the time-series analysis of the performance metric parameters (ED, η SFmax ). In the case of a physical failure or a partially or fully covered panel surface, the estimation difference value is uniformly degraded in all aggregation time intervals in the cross-correlation windows. The output of the proposed identification method for detecting on-surface degrading effects is given in the form of the PASS and FAIL statuses. False detections under fully shaded or partially shaded operation conditions are avoided, since the results of the analysis are taken as valid only during the detected clear-sky atmospheric conditions for the group of correlated panels. Detection of faulty PV operation as an outcome of the fault detection services in tier 2 leads to further consumer notifications and maintenance alarms as a part of the notification services at higher tiers of the reference fog computing-based architecture.

Detection of the Off-Surface Degrading Effect
In order to verify the presented method for detection of off-surface degrading effects, an obstacle was placed in front of PV panel 3 at a particular distance on day 25 of the experiment at 06:00. This resulted in panel operation under partially shaded conditions. As in Figure 3a,b, the relevant daily profiles of panels P3 and P4 are presented for the five days of interest, from day 22 to day 26. Similarly to the previous scenario, the clear-sky conditions, designated as detected (DET status) or not detected (NOT status), were confirmed using condition (4), while the correlation windows (CW 2 , CW 3 , and CW 4 ) and the corresponding cross-correlation windows CW 32 and CW 34 were evaluated from Equation (2) and are presented in Figure 4c,d. Furthermore, the values of the estimation difference factors ED 34 and ED 32 were calculated as in expression (3) and are presented in Figure 4e,f.
It is noticeable that during day 22, some cloudiness was present, overlapping with the correlation intervals, therefore indicating that the correlation output was not valid (notified as the SKIP status in Figure 4e,f).
On day 23, there were alternating periods of cloudy and clear-sky weather conditions, designated as the NOT and DET statuses, respectively, on a daily basis. However, each averaging time interval (10 min) has its own DET or NOT status; thus, it is possible to execute the detection procedure even on days with a low number of clear-sky intervals. The spikes during day 22, which can be found in ED 34 and ED 32 , as shown in Figure 4e,f, correspond to the situation where all panels in the group operate under cloudy weather conditions, indicated as the NOT status in the particular averaging interval. Therefore, the results of the correlation procedure for the particular averaging interval were found to be invalid, as indicated by the SKIP status. Since the granularity of the results presented in Figure 4e,f is on a daily basis, the SKIP and PASS statuses in different averaging intervals during day 23 are shown as a combination of both statuses, specified as SKIP or PASS. As the values of ED 34 and ED 32 in all averaging intervals with the PASS status were below the adopted value K FD , according to condition (7), irregular operation of the PV panels was not detected during day 23. Similarly, on day 24, as clear-sky conditions were confirmed (DET status), all of the values of ED 34 and ED 32 were below the adopted value K FD ; thus, no irregular operation of the PV panels was detected.
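The relation between the per-interval validity notes and the daily label can be sketched as follows. The sketch relies on the rule stated for the on-surface scenario, namely that the cross-correlation result is valid when the clear-sky condition is detected for at least one panel of the correlated pair. The function names and the treatment of a day without any intervals are hypothetical.

```python
from typing import Iterable, Tuple


def interval_note(det_i: bool, det_j: bool) -> str:
    """Validity note for one averaging interval of a correlated pair."""
    # Valid if clear sky is detected (DET) for at least one of the two panels.
    return "PASS" if (det_i or det_j) else "SKIP"


def daily_note(flags: Iterable[Tuple[bool, bool]]) -> str:
    """Combine per-interval notes into the daily label used in the figures."""
    notes = {interval_note(a, b) for a, b in flags}
    if not notes or notes == {"SKIP"}:
        return "SKIP"  # no valid interval (assumed also for an empty day)
    if notes == {"PASS"}:
        return "PASS"
    return "SKIP or PASS"
```

For a day with alternating cloudy and clear intervals, such as day 23, this yields the combined "SKIP or PASS" label, while only the PASS intervals would be passed on to the comparison with K FD .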
On day 25, it can be noticed that for certain intervals of both cross-correlation windows CW 32 and CW 34 , values of both ED 34 and ED 32 were higher than the values of the fault detection coefficient K FD , thereby satisfying condition (7), whilst there was at least one time interval where values of ED 34 and ED 32 were below the values of coefficient K FD , satisfying condition (8). Since both conditions (7) and (8) were satisfied, the off-surface degrading effect was confirmed. A similar situation was detected for panel P3 operation during day 26.
As noticed from the previous analysis, fully shaded or partially shaded operation of the PV panel, caused by nearby obstacles that cast a shadow on the panel surface, is identifiable through the cross-correlation analysis of the performance metric parameter ED inside the cross-correlation window. Degraded panel operation, identified as an estimated difference above the boundary value, is expected to be found for some, but not all, of the averaging time intervals from the cross-correlation window, as given through expressions (7) and (8). In contrast to panel operation with on-surface obstacles, operation with off-surface obstacles does not significantly affect the value of the system efficiency factor, since there are averaging time intervals inside the cross-correlation window in which the PV panel operates regularly, without shadows being cast on the panel surface. Similar to the detection of on-surface obstacles, detection of off-surface obstacles produces valid results only during the detected clear-sky conditions.

Detection of Soiling Degradation Effect
In order to verify the presented method for the detection of the soiling degradation effect, the following scenario was performed. On each day, starting from day 50 of the experiment, the surfaces of PV panels P2 and P4 were cleaned, whereas the surface of PV panel P3 was exposed to environmental accumulation of dust and dirt.
As previously stated, after the clear-sky conditions were confirmed and the correlation windows were determined, the estimation difference values for each panel (ED 2 , ED 3 , and ED 4 ) were calculated, using expressions (4), (2), and (3), respectively. As stated in Section 3.2, linear regression analysis was used, and the parameter b, the slope of the model, was determined using expression (9). The input data for the linear regression analysis were the time series of estimated difference ED 3 values for the time intervals t 1 = 12:30-12:40, t 2 = 12:40-12:50, t 3 = 12:50-13:00, and t 4 = 13:00-13:10 over a period of ten consecutive days, including only intervals with a detected clear-sky condition. The results are presented in Figure 5 for t 1 , t 2 , t 3 , and t 4 . Furthermore, the mean value and the standard deviation of the set of b(t) values were extracted from the obtained data. In order to detect the soiling effect in the operation of panel P3, as provided in expression (10), the coefficients K µ and K σ , used to define the sensitivity of the proposed method, need to be empirically determined from the previous behavior of PV panel P3. Additionally, the slopes of the models obtained from the other correlated panels need to be considered in order to avoid seasonal interference. For the conducted experiment, the mean value of the set of b(t 1 -t 4 ) was found to be µ b = 0.42%/day, while the standard deviation was σ b = 0.043%/day. Compared with the data from the previous observation periods, these results showed significant increases due to the observed soiling effects in the operation of panel P3. For comparison, the mean value and standard deviation of the set of b(t 1 -t 4 ) obtained from the cross-correlation analysis for panels P2 and P4, whose surfaces had been cleaned daily, were µ b = 0.18%/day and σ b = 0.015%/day, respectively.
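The statistics of the slope set b(t 1 -t 4 ) and a possible sensitivity check can be sketched as follows. Since expression (10) is not reproduced here, the decision rule below (mean slope above K µ while the spread stays below K σ ) is only one assumed reading of it. The function names, the use of the population standard deviation, and any threshold values passed in are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Sequence, Tuple


def slope_statistics(slopes: Sequence[float]) -> Tuple[float, float]:
    """Mean and (population) standard deviation of the slopes b(t1..tn)."""
    return fmean(slopes), pstdev(slopes)


def soiling_suspected(slopes: Sequence[float],
                      k_mu: float,
                      k_sigma: float) -> bool:
    """Assumed rule: consistent positive trend across all intervals.

    k_mu    -- lower bound on the mean slope (hypothetical threshold)
    k_sigma -- upper bound on the slope spread (hypothetical threshold)
    """
    mu_b, sigma_b = slope_statistics(slopes)
    return mu_b > k_mu and sigma_b < k_sigma
```

In practice, the thresholds would be derived from the earlier behavior of the observed panel and from the slopes of the correlated, regularly cleaned panels, so that a seasonal trend common to all panels is not reported as soiling.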
According to the data provided in Figure 5, some of the data points, namely the ones corresponding to days 52, 53, and 57, are missing, since the clear-sky condition was not detected in the observed averaging intervals t 1 -t 4 on those days. The individual clear-sky statuses for averaging intervals t 1 -t 4 , together with the slope of the linear regression model during the experiment, are provided in Table 3. As seen from the provided case study, short-term soiling causes effects similar to the fully shaded panel operation scenario, but with significantly lower degradation rates of the estimated difference parameter. These effects are successfully detectable through the trend analysis of the time series of estimated difference values for the particular averaging interval over a series of consecutive days, or over a time interval shorter than the duration of the observation interval. On the other hand, the effects of long-term soiling on the estimation difference values are partially compensated. This is a consequence of the fact that, in cases of slowly changing dust accumulation, the time interval of the change is comparable with the duration of the observation window. Thus, changes in the estimated difference ED ij are compensated through the variation of the system efficiency factor η SF of the panel exposed to soiling effects. In such a scenario, the degrading effects are identifiable through the trend analysis of the variation of the system efficiency factor itself, since the value of the estimation difference parameter obtained from the cross-correlation analysis is not affected.
The experimental analysis given in Section 4 confirmed that the different degrading effects, which result in PV system underperformance, are identifiable through temporal analysis of the system performance metric parameters and the proposed cross-correlation analysis. Additionally, the adopted fog computing model is found to be highly suitable for the implementation of the proposed identification services in the form of sequential processing steps, since it enables the distribution of computing, storage, and networking steps in a hierarchically organized form. Furthermore, as its inherent characteristic, the adopted fog-based architecture is capable of overcoming the limitations of the current infrastructure and supports the design of more comprehensive large-scale applications for monitoring and maintenance of photovoltaic systems in urban environments.

Conclusions
The system-to-system comparison among the group of correlated PV systems represents a flexible framework for robust identification of degrading effects in the operation of neighboring PV systems. The proposed identification methods are based on the analysis of the temporal properties of the performance metric parameters under different operational conditions, including panel aging, mid-term soiling, and on-surface and off-surface shading operations. Since the methodology provides valid estimates exclusively under the empirically detected clear-sky conditions, the effects of varying irradiance found under the partially shaded weather conditions are omitted. Considering that the set of identification methods is given in the form of sequential processing, whereby sensor-side data processing can be localized at the network edge, the implementation of identification services according to the tiered model is particularly suitable. Furthermore, according to the adopted fog computing reference model, the proposed hierarchically organized architectural solution enables integration of PV system monitoring and maintenance services in the form of a distributed application capable of acquiring large-scale data and providing various end-user services. The utilization of more advanced statistical methods and machine learning algorithms for cross-correlation analysis, as well as the extension of the identifiable fault types and the detection of multiple faults, are planned as parts of a future study.
Funding: This research was funded by the Ministry of Education, Science, and Technological Development of the Republic of Serbia.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.