The first step in numerical weather prediction (NWP) is to obtain an accurate estimate of the atmospheric state to initialize the NWP model. Probably the most challenging aspect of this step is estimating the characteristics of the cloud field: the location of the clouds and their vertical extent, as well as the type and content of the hydrometeors. The challenge arises from either inaccuracies in the observations or inadequate spatio-temporal sampling for NWP applications. The problem is aggravated for nowcasting applications, wherein the latency of the data can become an issue. A clear example is nowcasting surface irradiance for solar energy applications, e.g., [1
] wherein misplacement of clouds during the initialization can introduce large errors even in very short-term predictions.
An attractive option to initialize the cloud field in nowcasting applications is to use retrievals from geostationary satellites. These retrievals have low latencies (minutes) and adequate spatio-temporal sampling for NWP applications. For instance, the Advanced Baseline Imager (ABI) [3
] on board the GOES-R satellites provides full disk scans every 10 minutes at a resolution ranging from 0.5 to 2 km depending on the spectral band. In addition, the spectral resolution is now comparable with that of instruments on board polar-orbiting satellites such as the Moderate Resolution Imaging Spectroradiometer (MODIS) or the Visible Infrared Imaging Radiometer Suite (VIIRS), which should translate into more accurate retrievals of the cloud variables than those from the previous GOES series.
The cloud retrieval process starts with the determination of the cloud mask. Both statistical and threshold-based approaches have been proposed, e.g., [4
]. Operational cloud mask algorithms are threshold based: they use a combination of spatial, temporal, and spectral tests together with a set of thresholds to discern cloudy pixels from clear sky ones, e.g., [6
]. For example, the ABI Clear Sky Mask (ACM) product uses a combination of the aforementioned tests with thresholds selected so that cloud over-detection (i.e., clear sky incorrectly classified as cloudy) does not exceed 2% [14].
In this study we quantify the performance of the GOES-R ACM product from the satellite located in the GOES East position, GOES-16, in detecting clouds over the contiguous U.S. (CONUS). The main motivation is to assess the potential of this cloud mask product to initialize clouds in an NWP model designed for solar irradiance nowcasting called MAD-WRF. MAD-WRF combines the strengths of two models, MADCast [15
] and WRF-Solar [16
]. MADCast retrieves the cloud fraction from satellite infrared channels and advects and diffuses this cloud fraction as a tracer using a modified version of the Weather Research and Forecasting (WRF) model [17
]. WRF-Solar is an augmented model specifically designed for solar energy applications, and its main developments have focused on enhancing the aerosol-cloud-radiation interactions. The strength of MAD-WRF lies in the combination of the initialization process, wherein cloud retrievals can be assimilated, with the advection and diffusion of the initial hydrometeors as tracers that nudge the resolved hydrometeors at the beginning of the simulation and, in this way, help develop an environment that supports the initial cloud field. Hence, an accurate cloud initialization is crucial for this particular application, and it is the motivation for assessing the performance of the GOES-R ACM product.
The assessment is performed by comparing two years of GOES-16 ACM retrievals interpolated to a 9 km grid covering the contiguous U.S. against the cloud mask retrievals from the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation (CALIPSO) [18
]. CALIPSO is equipped with an active sensor that to our knowledge provides the most accurate cloud detections covering CONUS during the multi-year period of analysis. The 9-km grid is well suited for nowcasting applications with MAD-WRF since it provides low latencies in production runs.
The multi-year assessment presented herein provides the most comprehensive evaluation of the ACM product over CONUS to date. The previous studies assessing the ACM performance that we are aware of have also used CALIPSO retrievals, but they have focused on a different region [14
], or provided a basic evaluation of the product performance [5
]. The algorithm theoretical basis [14
] quantifies the performance of the ACM algorithm using retrievals from the Spinning Enhanced Visible and Infrared Imager (SEVIRI) on board Meteosat. Hence, the study does not include any retrieval over the CONUS. The assessment used eight weeks of data distributed over the four seasons and found a probability of detection, or accuracy (number of correct retrievals divided by the number of retrievals), for all sky conditions (clear or cloudy) of 91.4%. The other study [5
] focused on the development of an alternative retrieval algorithm based on statistical modeling, and only provides a basic evaluation of the ACM product as a baseline for comparison. Using about two months of data in each season, the study found an accuracy under all sky conditions of 85.7%, with higher accuracy for cloud detection, 93.0%, than for clear sky detection, 67.7%. The study used the same grid under consideration herein, and thus the present work can be considered, to a certain extent, an extension of the evaluation presented in [5
]. The novel aspects of this study include a better pairing of GOES-16 and CALIPSO retrievals and the use of a multi-year evaluation period. The multi-year period allows, for the first time, a statistically robust characterization of the performance during daytime and nighttime using either the complete dataset or subsets for the four seasons. Additionally, using a multi-year period provides sufficient data to quantify the performance on spatial plots and, to the best of our knowledge, this is the first time that the ACM performance has been characterized on two-dimensional maps. Hence, we are able to identify the periods and regions wherein the ACM product shows its largest potential to initialize the clouds in the MAD-WRF model.
The manuscript is organized as follows. Section 2
describes the GOES-16 and CALIPSO retrievals as well as the spatio-temporal matching of both datasets. Section 3
presents the results, and the discussion is presented in Section 4.
The accuracy of the ACM product for all sky conditions calculated with the two years of retrievals during 2018 and 2019 is 86.0%. However, the accuracy for cloud detection, 90.9%, is higher than for clear sky detection, 74.8%. To assess the impact of the temporal matching of CALIPSO and ACM retrievals, the accuracies as a function of the time offset are shown in Figure 3
. As expected, the curves show the maximum when no offset is introduced. When a temporal offset is introduced, the accuracy decays faster for clear sky detection than for cloud detection. For instance, for a 1 h time offset the accuracy of the cloud detection drops to 88.0% (2.3% reduction) whereas that of the clear sky detection drops to 69.0% (7.7% reduction). This is a consequence of having more cloudy (950,350; 69.3%) than clear sky (420,662; 30.7%) retrievals and probably of an over-detection of clouds by the ACM product.
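The accuracy metrics used throughout this section can be sketched in a few lines of Python. This is a minimal illustration rather than the processing code used in the study: the function names are hypothetical, CALIPSO is taken as the reference, and the matched retrievals are assumed to be available as two equal-length sequences of booleans (True for cloudy). The last line checks, under the assumption that the counts quoted above refer to the CALIPSO classes, that the all sky accuracy is the frequency-weighted mean of the cloud and clear sky accuracies.

```python
def detection_accuracies(acm_cloudy, calipso_cloudy):
    """Return (all_sky, cloud, clear_sky) accuracies for matched binary masks.

    Both arguments are equal-length sequences of booleans (True = cloudy).
    CALIPSO is treated as the reference (hypothetical helper, not from the paper).
    """
    if len(acm_cloudy) != len(calipso_cloudy):
        raise ValueError("masks must have the same length")
    pairs = list(zip(acm_cloudy, calipso_cloudy))
    n_cloud = sum(1 for _, ref in pairs if ref)
    n_clear = len(pairs) - n_cloud
    # Cloudy in both masks, and clear in both masks.
    hits = sum(1 for acm, ref in pairs if acm and ref)
    correct_clear = sum(1 for acm, ref in pairs if not acm and not ref)
    all_sky = (hits + correct_clear) / len(pairs)
    cloud = hits / n_cloud if n_cloud else float("nan")
    clear = correct_clear / n_clear if n_clear else float("nan")
    return all_sky, cloud, clear


def weighted_all_sky(cloud_acc, clear_acc, n_cloud, n_clear):
    """All sky accuracy as the frequency-weighted mean of the class accuracies."""
    return (n_cloud * cloud_acc + n_clear * clear_acc) / (n_cloud + n_clear)


# Consistency check with the values quoted in the text (assumed CALIPSO counts).
print(round(weighted_all_sky(90.9, 74.8, 950350, 420662), 1))  # prints 86.0
```

Because cloudy retrievals outnumber clear sky ones by more than two to one, the all sky accuracy sits much closer to the cloud detection accuracy than to the clear sky one.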
The amount of clouds in our grid is modulated by the cloud fraction threshold used to create the binary cloud mask from the cloud fraction calculated with the ACM product. This threshold has been set to zero, and there is no solid theoretical basis for changing it, but analyzing the sensitivity to this threshold helps us understand the performance of the ACM product. With this aim, the accuracy as a function of the cloud fraction threshold is shown in Figure 4
a. As expected, the accuracy of the cloud detection decreases as the cloud fraction threshold increases whereas the opposite is true for the clear sky detection. The accuracy for all sky conditions shows the maximum for a zero cloud fraction threshold which is also the expected behavior (and in agreement with [5
]). However, the behavior of the accuracy metrics differs for daytime and nighttime (Figure 4
b,c). During daytime (Figure 4
b), the discrepancies already noticed between the cloud and clear sky detections are exacerbated. For a zero cloud fraction threshold, the accuracy for cloud detection is 95.9% whereas that for clear sky detection is 66.6%. Interestingly, the accuracy for all sky conditions does not peak at a zero cloud fraction threshold (87.2%) but at 0.20 (87.6%), although the curve is rather flat. The high (low) accuracy for cloud (clear sky) conditions suggests a cloud over-detection during daytime. In contrast, nighttime shows similar accuracy for cloud and clear sky detections (85.8% and 82.5%, respectively) for a zero cloud fraction threshold (Figure 4
c). In addition, the maximum of the accuracy for all sky conditions is at a zero cloud fraction (84.7%).
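The threshold sensitivity described above can be illustrated with the following sketch. It assumes that a grid cell is flagged as cloudy when its cloud fraction is strictly greater than the threshold (so a zero threshold flags any non-zero cloud fraction); this comparison rule, the function names, and the sample values are assumptions made for illustration only.

```python
def binary_mask(cloud_fraction, threshold=0.0):
    """Flag a cell as cloudy when its cloud fraction exceeds the threshold.

    The strict '>' comparison is an assumption, not taken from the paper.
    """
    return [cf > threshold for cf in cloud_fraction]


def accuracy_vs_threshold(cloud_fraction, calipso_cloudy, thresholds):
    """Map each threshold to (all_sky, cloud, clear_sky) accuracies.

    Assumes the CALIPSO reference contains both cloudy and clear cases.
    """
    n_cloud = sum(calipso_cloudy)
    n_clear = len(calipso_cloudy) - n_cloud
    results = {}
    for threshold in thresholds:
        mask = binary_mask(cloud_fraction, threshold)
        hits = sum(m and ref for m, ref in zip(mask, calipso_cloudy))
        correct_clear = sum(
            not m and not ref for m, ref in zip(mask, calipso_cloudy)
        )
        results[threshold] = (
            (hits + correct_clear) / len(mask),
            hits / n_cloud,
            correct_clear / n_clear,
        )
    return results


# Hypothetical matched sample: ACM cloud fraction on the grid and CALIPSO mask.
cloud_fraction = [0.0, 0.1, 0.5, 1.0, 0.0, 0.3]
calipso_cloudy = [False, False, True, True, False, True]
for thr, acc in accuracy_vs_threshold(
    cloud_fraction, calipso_cloudy, [0.0, 0.2, 0.4]
).items():
    print(thr, [round(a, 3) for a in acc])
```

In this toy sample the cloud detection accuracy can only stay constant or decrease as the threshold grows, and the clear sky accuracy can only stay constant or increase, which is the behavior reported for the ACM product.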
Figure 5 shows the spatial distribution of the accuracy. Results for each grid cell are calculated with the matched pairs of ACM and CALIPSO cloud masks over a region of 15 by 15 grid points. This smoothing is necessary since, despite using two years of data, the spatial density is not sufficient to display results at 9 km of grid spacing. The accuracy for all-sky conditions is higher for the eastern part of the CONUS, wherein the accuracy is larger than 90% (Figure 5
a). The western portion of the CONUS shows lower values but never smaller than 70%. As already pointed out, the accuracy is larger for the cloud detection (Figure 5
b) than for the clear sky detection (Figure 5
c). More specifically, the cloud detection accuracy is larger than 90% almost everywhere in the CONUS, and larger than 95% in certain regions. On the other hand, the clear sky accuracy shows its largest values, above 85%, in the south of the CONUS, whereas some regions in the north show accuracies close to 50%. Low accuracy values can also be seen over the Rocky Mountains. In spite of this different behavior between clear and cloudy skies, the higher frequency of the latter makes the all-sky accuracy more strongly influenced by the results of the cloud detection. A skill score that takes into account the frequency of both populations is the Kuipers skill score (KSS). The KSS is defined as the difference between the hit rate and the false alarm rate, and it ranges from −1 to 1, with 0 indicating no skill. The distribution of the KSS is shown in Figure 5
. Over the CONUS, the pattern is more similar to the clear sky accuracy (Figure 5
c) since the accuracy of the cloud detection is very high (Figure 5
b). The KSS shows values larger than 0.8 in the south and lower values in the north that can reach down to 0.4.
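As an illustration of the two ingredients of these maps, the sketch below pools per-cell contingency counts over a 15 by 15 window and computes the KSS from the pooled counts. It is a sketch under stated assumptions: the window is taken to be centered on the grid cell, the false alarm rate is taken to be the fraction of CALIPSO clear sky cases that the ACM flags as cloudy, and all names and counts are hypothetical.

```python
def window_counts(counts, row, col, half_width=7):
    """Sum contingency counts over a (2 * half_width + 1)^2 window.

    `counts` maps (row, col) to (hits, misses, false_alarms, correct_negatives).
    half_width=7 gives a 15 by 15 window centered on (row, col), which is an
    assumption about how the window is placed. Cells absent from `counts`
    contribute nothing.
    """
    total = [0, 0, 0, 0]
    for r in range(row - half_width, row + half_width + 1):
        for c in range(col - half_width, col + half_width + 1):
            for i, value in enumerate(counts.get((r, c), (0, 0, 0, 0))):
                total[i] += value
    return tuple(total)


def kuipers_skill_score(hits, misses, false_alarms, correct_negatives):
    """Hit rate minus false alarm rate; NaN when either class is empty."""
    n_cloud = hits + misses
    n_clear = false_alarms + correct_negatives
    if n_cloud == 0 or n_clear == 0:
        return float("nan")
    return hits / n_cloud - false_alarms / n_clear


# Hypothetical counts: two cells inside the window of (10, 10), one far away.
counts = {(10, 10): (8, 2, 1, 9), (12, 11): (2, 0, 1, 1), (30, 30): (5, 5, 5, 5)}
pooled = window_counts(counts, 10, 10)
print(pooled, round(kuipers_skill_score(*pooled), 3))
```

With these definitions the KSS equals the cloud detection accuracy plus the clear sky accuracy minus one, so it weights both classes equally regardless of their frequency. This is why its spatial pattern follows the clear sky accuracy wherever the cloud detection accuracy is uniformly high.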
The evolution of the accuracy over the two year period is shown in Figure 6
, which shows box-and-whisker plots of the daily accuracy on a monthly basis. The median of the accuracy (middle bar) for all sky conditions has small variability, with a tendency to show the lowest values during the summer (Figure 6
a). This is also evidenced during daytime (Figure 6
b) and it is less clear during nighttime (Figure 6
c). The largest variability is for the clear sky detection during daytime (Figure 6
b), which, in addition to the minimum during the summer, shows a maximum during the fall. The range of variability is about 20%. In contrast, the median of the accuracy of nighttime retrievals shows low variability over the year (Figure 6
c). The monthly distributions during daytime (Figure 6
b) are wider for clear sky detections than for clouds or all sky conditions. However, this behavior is not recognized during nighttime (Figure 6
c), wherein the interquartile range (75th percentile minus 25th percentile) is similar for both cloud and clear sky detections.
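The monthly statistics summarized by the box plots (median and the distance between the 75th and 25th percentiles) can be computed as in the following sketch. The input format, the function name, and the sample values are hypothetical, and the percentile interpolation rule (the "inclusive" method of the Python standard library) is an assumption, since the text does not state which rule was used.

```python
from collections import defaultdict
from statistics import median, quantiles


def monthly_box_stats(daily_accuracy):
    """Group (date, accuracy) pairs by month and summarize each month.

    `daily_accuracy` is an iterable of ("YYYY-MM-DD", accuracy) tuples.
    Returns {"YYYY-MM": (median, p75 - p25)}.
    """
    by_month = defaultdict(list)
    for date, accuracy in daily_accuracy:
        by_month[date[:7]].append(accuracy)

    stats = {}
    for month, values in sorted(by_month.items()):
        if len(values) < 2:
            continue  # quantiles() needs at least two data points
        q25, _, q75 = quantiles(values, n=4, method="inclusive")
        stats[month] = (median(values), q75 - q25)
    return stats


# Hypothetical daily accuracies for one month.
sample = [
    ("2018-01-01", 0.80),
    ("2018-01-02", 0.85),
    ("2018-01-03", 0.90),
    ("2018-01-04", 0.95),
    ("2018-01-05", 1.00),
]
for month, (med, spread) in monthly_box_stats(sample).items():
    print(month, med, round(spread, 2))
```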
Further insights on the performance become evident analyzing the accuracy as a function of the grid row (Figure 7
). The accuracy under all sky conditions is higher than under clear skies for all rows, or latitudes, under consideration (Figure 7
a). For clear sky retrievals there is a clear degradation of the accuracy for the rows higher than about 150 (square in Figure 1
, roughly parallel 36). These findings were already recognized in the spatial patterns of the accuracy (Figure 5
). The clear sky degradation as a function of latitude is a consequence of the daytime retrievals (Figure 7
b) since the accuracy of the nighttime retrievals shows smaller variations with latitude (Figure 7
c). The accuracy of daytime retrievals beyond grid row 250 (triangle in Figure 1
, roughly parallel 44) is very close to 0.5, which suggests little skill of the cloud mask in this subsample.
To further examine the origin of these uncertainties, Figure 8
shows the accuracy during daytime for the four seasons. Clearly, the erroneous retrievals dominating the previous behavior occur during winter (Figure 8
a). The spatial distribution (Figure 9
a) indicates this is a general behavior over the CONUS, with larger latitudinal contrast in the east. The summer season (Figure 8
c) shows little dependence on the latitude, although the accuracy is somewhat low (around 0.6). The spatial distribution reveals large spatial variability during this season (Figure 9
b). Spring and autumn (Figure 8
b,d) show intermediate behavior between the winter and summer seasons.
Nighttime conditions show small variability as a function of the latitude during the four seasons (Figure 10
), with remarkably similar behavior (accuracy generally higher than 0.8) of the cloud and clear sky retrievals during summer (Figure 10).
To better understand the uncertainties of the ACM product, the cloud top height histogram from the CALIPSO retrievals is shown in Figure 11
. The distribution has two maxima (black solid line): one around 1 km above ground level (AGL), corresponding to boundary layer clouds, and the other at around 12 km AGL, corresponding to clouds at the top of the troposphere. Figure 11
also shows the histogram of the cloud hits/misses, which clearly shows a misdetection of boundary layer clouds. This pattern of underestimating low clouds is found during both day and night retrievals as well as in the four seasons (not shown).
In order to understand the origin of the misdetections under clear sky conditions, we use the ACHA product. First, we inspect the accuracy of the binary cloud mask calculated with the ACHA product. The all sky accuracy is 82.9%. The accuracy of the cloud retrievals is 91.2% whereas that of the clear sky detection is 64.8%. These values are similar to the ones obtained with the ACM product, but there is a decay in the clear sky detection (13.4% in relative terms). This is partially related to the different spatial resolution of the ACHA product (10 km) with respect to the ACM product (2 km), with probably a small modulation due to the reduced spatial coverage of the ACHA product.
The histograms of the cloud top height retrievals from the ACHA product for the clear sky cases misclassified as cloudy are shown in Figure 12
. The distribution shows its maximum near the surface and a secondary one at about 2 km AGL (black line). These peaks are mainly associated with daytime retrievals (dashed black line) and nighttime retrievals (gray line), respectively. The daytime maximum near the surface is present during the four seasons whereas the nighttime maximum mostly occurs during the summer (not shown). As with the cloud misdetections, the main discrepancies are associated with low level clouds.
The performance of the ACM product over CONUS has been examined. The evaluation has been performed by comparing the binary cloud mask on a 9 km grid to CALIPSO retrievals on the same grid. Two years of retrievals have been used to provide a statistically robust characterization of the product. The probability of detection, or accuracy, under all sky conditions is 86.0%, with the accuracy of cloud detection (90.9%) higher than that of clear sky detection (74.8%). The all sky detection and cloud detection are in agreement with [5
], which found accuracies of 85.7% and 93.0%, respectively. The clear sky detection accuracy is higher than the 67.7% reported in [5
]. The values found in this evaluation should be more accurate considering the much longer evaluation period, and the better temporal match of observations ([5
] used the completion time of the retrievals for ACM and the nearest hour for the CALIPSO retrievals, allowing for 800 s differences between both time stamps). The lower performance of clear sky retrievals is a result of misdetections during daytime. This is especially clear for summer, and for regions to the north of parallel 36 during winter. The nighttime retrievals show similar performance for the cloud and clear sky retrievals.
The majority of the misdetections are associated with low clouds. The undetected clouds generally have their cloud top height within the first 2 km AGL, with the maximum around 1 km AGL. This is in agreement with the findings of [14
] over the region covered by Meteosat. The cloud over-detection is also associated with low level clouds. In this case, the behavior differs between daytime and nighttime. During daytime, the cloud top height distribution of the over-detected clouds shows a maximum near the ground and a rapid decrease with height. There are fewer over-detections for cloud top heights higher than 2 km. During nighttime, the maximum of the cloud top height distribution is around 2 km.
The main motivation for this assessment was to evaluate the potential of the ACM product for initializing the clouds in the MAD-WRF NWP model. MAD-WRF is a solar irradiance nowcasting system and thus it is important to have an accurate cloud initialization, especially during daytime. However, the ACM product performs better during nighttime than during daytime. The performance of daytime retrievals to the north of parallel 36 is probably not adequate for this particular application. In contrast, regions to the south of parallel 36 show acceptable performance during both daytime and nighttime. It is over these regions that the ACM product should show its largest potential to enhance the cloud initialization in the MAD-WRF model.
To assess these conclusions, Figure 13
shows the mean absolute error (MAE) calculated with global horizontal irradiance (GHI) hourly analyses performed for the month of April 2018 with both WRF-Solar and MAD-WRF. The GHI analysis was created using the 1 h forecasts from the High Resolution Rapid Refresh (HRRR) [24
] model run operationally by the National Centers for Environmental Prediction (NCEP). The 1 h forecast is available about 20 min before its valid time, which allows one to use it in nowcasting applications. The comparison is performed against the observations from the United States Climate Reference Network (USCRN) [25
]. The WRF-Solar model shows MAE values larger than 100 W m⁻²
at most of the sites, with values larger than 150 W m⁻²
at certain locations (Figure 13
a). These errors are reduced with the MAD-WRF analysis that imposes the GOES-16 cloud mask (Figure 13
b). The MAD-WRF analysis shows MAE values lower than 100 W m⁻²
at the majority of the sites to the south of parallel 36. To the north of this parallel, the errors are higher in the western part of the CONUS, wherein the topography is more complex, and in the eastern part. The errors are lower in the center of CONUS. Although we are using only one month of data, the spatial distribution of the MAE is in good agreement with the KSS of the ACM product (Figure 5
d). This further stresses the importance of having an accurate retrieval to initialize the cloud field. Potential improvements in the performance of the GOES-16 ACM product should translate into a more accurate cloud initialization and thus an improved performance of the MAD-WRF model nowcasts.