Summertime Continental Shallow Cumulus Cloud Detection Using GOES-16 Satellite and Ground-Based Stereo Cameras at the DOE ARM Southern Great Plains Site

Summertime continental shallow cumulus clouds (ShCu) are detected using Geostationary Operational Environmental Satellite (GOES)-16 reflectance data, with cross-validation by observations from ground-based stereo cameras at the Department of Energy Atmospheric Radiation Measurement Southern Great Plains site. A ShCu cloudy pixel is identified when the GOES reflectance exceeds the clear-sky surface reflectance by a ShCu reflectance detection threshold, ∆R. We first construct diurnally varying clear-sky surface reflectance maps and then estimate ∆R. A GOES simulator is designed, projecting the clouds reconstructed by stereo cameras towards the surface along the satellite’s slanted viewing direction. The dynamic ShCu detection threshold ∆R is determined by making the GOES cloud fraction (CF) equal to the CF from the GOES simulator. Although ∆R varies in time, cloud fractions and cloud size distributions can be well reproduced using a constant ∆R value of 0.045. The method presented in this study enables the daytime detection of ShCu, which are usually falsely reported as clear sky in the GOES-16 cloud mask data product. Using this method, a new ShCu dataset can be generated to bridge the observational gap in detecting ShCu, which may transition into deep precipitating clouds, and to facilitate further studies on ShCu development over heterogeneous land surfaces.


Introduction
Continental shallow cumulus clouds (ShCu) are tightly coupled with the land surface and have strong impacts on the surface energy partition and the turbulent transport of momentum, heat, and moisture [1][2][3][4][5][6]. They are also important for their seeding role in transitioning into hot-tower deep convective clouds with heavy precipitation [7]. ShCu are small in horizontal extent, from a few hundred meters to several kilometers, and they are short-lived, lasting 15 to 30 min from initiation to dissipation [8][9][10][11][12][13][14]. At such spatiotemporal scales, ShCu cannot be explicitly resolved and are instead highly parameterized in climate models. Systematic model biases in the surface energy balance and precipitation are often attributed to the parameterized processes associated with ShCu, e.g., the shallow-to-deep convection transition and the interactions among land surface, atmospheric boundary layer, clouds, and precipitation [15,16].
To develop a theoretical understanding and improve parameterizations, comprehensive and continuous observations on ShCu are greatly needed, especially those resolving the diurnal variation in cloud life cycle and cloud size over a vast land area with varying (east/west) and 3000 km (north/south) rectangle over the CONUS. The 'red' visible band on ABI, 0.64 µm, has the finest spatial resolution (0.5 km at the nadir point). The high spatiotemporal resolution of GOES-16 brings new potential to detect most ShCu continuously and simultaneously throughout their life cycles over a large region. However, so far there is no operational GOES-16 cloud product that is dedicated to ShCu studies and can be easily used for them.
The aim of this study is to detect summertime continental ShCu by taking advantage of the high temporal and spatial resolutions of GOES-16 reflectance data and three-dimensional cloud structures retrieved from ground-based stereo cameras. We determine an optimal detection threshold for ShCu using a newly developed GOES simulator, which projects the clouds reconstructed by stereo cameras towards the surface along the satellite's slanted viewing direction. The new ShCu dataset from this study can bridge the aforementioned gap in ShCu observations. The remaining parts of the paper are organized as follows. Section 2 introduces the data in detail. Section 3 describes the procedures for ShCu detection. The results are presented in Section 4, the discussion in Section 5, and a summary in Section 6.

Data
The major datasets used in this study are from two kinds of instruments: (1) ground-based stereo cameras at the ARM SGP site and (2) ABI on GOES-16.
Ground-based stereo cameras were installed in three pairs at 6 km from the central facility at the ARM SGP site in Oklahoma. Using observations from these six stereo cameras, the Clouds Optically Gridded by Stereo (COGS) data product was developed [22]. The COGS product is a four-dimensional grid of cloudiness within a 6 km × 6 km × 6 km cube at a spatial resolution of 50 m and a temporal resolution of 20 s. Within that domain, the intersection of the fields of view of the six cameras forms a three-dimensional polygon that we will refer to here as the reconstructable region: this is the part of the domain in which clouds can be measured. The central location of the COGS product is at 36°36′19″N, 97°29′11″W. Compared to the widely used DOE ARM Active Remote Sensing of CLouds (ARSCL) product [5,10,45], which is developed by merging observations from multiple vertically pointing sensors (cloud radars, lidars, and ceilometers) [46], the COGS product improves cloud fraction and cloud size estimations. This is mainly because (1) the COGS product is not impacted by insects, while identifying and removing insect clutter from Ka-band ARM Zenith pointing Radar (KAZR) observations is one of the most challenging tasks of the operational ARSCL processing [47,48]; (2) COGS provides a more representative four-dimensional grid of cloudiness, while KAZR and ceilometer are 'soda-straw' observations with very limited spatial coverage.
COGS data used in this study are from 27 days of ShCu observations during July and August in 2018 and 2019, satisfying the following criteria: (1) cloud fraction less than 0.6; (2) no surface precipitation reported by Arkansas-Red Basin River Forecast Center (ABRFC) data; and (3) cloud top under 4 km.
In this study, we use the reflectance data in the 'red' visible channel from GOES-16 ABI CONUS scans, taken from the GOES-16 cloud and moisture imagery (CMI) data product [49,50]. CMI provides the dimensionless quantity "reflectance factor", which is defined as the top of atmosphere (TOA) Lambertian equivalent albedo multiplied by the cosine of the solar zenith angle. The reflectance used in this study is the red-band TOA reflectance (Lambertian equivalent albedo) [50]. The spatial resolution varies with the distance from the GOES nadir and is approximately 650 m in the SGP region. The temporal resolution is 5 min.
The ABI Clear Sky Mask (ACM) product is an official and operational cloud mask product for GOES-16. It uses data from nine channels and a combination of spatial, temporal, and spectral tests to label clear-sky and cloud-sky pixels [37,51]. It provides a binary mask (clear or cloudy) every 5 min (in CONUS scan), and the horizontal resolution of ACM is 2 km. Such a resolution is not adequate for the detection of ShCu clouds, most of which are much smaller than 2 km.
An example of ShCu cloud fields on 11 July 2018 observed from both ground-based stereo cameras and the GOES-16 satellite is illustrated in Figure 1. The formulation to calculate cloud fraction is:

$$\mathrm{CF} = \frac{\text{Number of cloudy pixels in a certain region}}{\text{Number of total pixels in the same region}} \qquad (1)$$

For COGS 3D (x-y-z) data at 50 m resolution, the cloud thickness (2D, x-y) can be calculated. A two-dimensional cloud mask is generated where a pixel is labeled as cloudy if the cloud thickness at the corresponding position is larger than zero. Figure 1a shows the daytime evolution of the vertical profile of cloud fraction from 18 to 23 UTC (12 to 17 local time) based on the COGS product. ShCu cloud fields undergo an initial formation of small and thin clouds before 19 UTC, a mature development with maximum cloud vertical extent and cloud fraction between 19 and 21 UTC, and a dissipation stage towards the end of the day. Figure 1b shows the cloud mask based on COGS data at 20:33:40 UTC (dashed line in Figure 1a); at this time, three large clouds are clearly seen within the 6 km by 6 km COGS domain. Figure 1c shows the cloud reflectance from GOES-16 at 20:33:43 UTC on the same day over a large region in the vicinity of the SGP site, with the COGS domain denoted by the red box. The three shallow cumulus clouds in the COGS image (Figure 1b) appear as bright, high-reflectance objects in the GOES image (Figure 1c). Figure 1d shows the cloud mask from the GOES operational ACM data product.
These three ShCu clouds seen in the COGS image (Figure 1b) are clearly not fully reported as clouds in the GOES ACM product (Figure 1d). Figure 1e shows our newly developed cloud mask with the successful detection of the three ShCu clouds in the COGS domain (Figure 1b) using GOES-16 reflectance data (Figure 1c). The new method is discussed in detail in the following sections.
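The cloud fraction of Equation (1) and the thickness-based two-dimensional mask can be illustrated with a minimal sketch. It assumes the COGS cloudiness is available as a nested list indexed [z][y][x] of 0/1 values on a 50 m grid; the function names and the data layout are hypothetical and are not part of the COGS product.

```python
def cloud_thickness(grid, dz_m=50.0):
    """Return the 2D cloud thickness [y][x] in metres from a 3D 0/1 grid [z][y][x].

    The [z][y][x] layout and the 50 m vertical spacing are assumptions
    made for this illustration.
    """
    nz, ny, nx = len(grid), len(grid[0]), len(grid[0][0])
    return [
        [dz_m * sum(1 for k in range(nz) if grid[k][j][i]) for i in range(nx)]
        for j in range(ny)
    ]


def cloud_fraction(thickness):
    """Equation (1): cloudy pixels (thickness > 0) over total pixels."""
    cells = [t for row in thickness for t in row]
    return sum(1 for t in cells if t > 0) / len(cells)


# Tiny 2 x 2 x 2 example: two of the four columns contain cloud.
example_grid = [
    [[1, 0], [0, 0]],  # lower level
    [[1, 0], [0, 1]],  # upper level
]
example_thickness = cloud_thickness(example_grid)
example_cf = cloud_fraction(example_thickness)
```

In this toy case one column holds two cloudy cells and another holds one, so two of the four columns are flagged as cloudy.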

Methods
In the new method, a cloudy pixel is identified when the GOES reflectance exceeds the clear-sky surface reflectance (in the absence of clouds for that pixel) by a detection threshold of ShCu reflectance (∆R) [52,53]. This can be expressed as $R_{\mathrm{GOES}}(x,y,t) - R_{\mathrm{clear}}(x,y,t) > \Delta R$, where $R_{\mathrm{GOES}}(x,y,t)$ denotes the GOES reflectance at pixel (x, y) and time t. In the following, we first generate diurnally varying clear-sky surface reflectance maps, $R_{\mathrm{clear}}(x,y,t)$, and then make estimates of the ShCu detection threshold, ∆R.

Determination of Clear-Sky Surface Reflectance
Clear-sky surface reflectance is determined at high resolution (ABI pixel size, approximately 650 m at SGP) based on an algorithm in [53] and [52] that we describe below. In [49], it was confirmed that the patterns of clear-sky radiation using this algorithm agree well with the MODIS albedo maps. The assumption for this algorithm is that land surfaces are stable and slow-varying over a restricted time period and that the signal measured by the GOES instrument during this period can be considered only a function of the scene-viewing geometry and cloudiness [54]. In this study, clear-sky surface reflectance climatology maps are generated for July-August based on data in 2018 and 2019. To reduce the diurnal variability effects from the solar zenith angle, while at the same time still retaining sufficient data, clear-sky surface reflectance climatology for July-August is calculated per pixel per hourly interval using 45 randomly selected days in July and August over the 2-year period. Figure 2a shows that, for each pixel, 540 reflectance values (45 days × 12 observations per hour) are sorted to generate a cumulative distribution function (CDF) at 21 UTC (15 local time) with a bin spacing of 0.01. The clear-sky surface reflectance is determined as the reflectance value with the greatest gradient along the curve of the CDF. The greatest gradient corresponds to the change from cloud shadow to clear-sky surface. The reflectance values smaller than the determined surface reflectance correspond to the cloud shadow [54]. The sample numbers in each bin are shown in Figure 2b. To generate the CDF, adequate clear-sky cases are needed. Among the randomly selected 45 days, manual case screening of satellite images confirms around half of the cases as clear sky. The spatial variability of the clear-sky surface reflectance is clearly seen within the COGS 6 km by 6 km domain (Figure 2c).
The surface reflectance values in a large 4° by 3.5° domain centered at the COGS location are also calculated and shown in Figure 2d. It is found that the clear-sky surface reflectance agrees well with the land cover map (Figure 2e), in which higher surface reflectance is found over city and over grassland than over forest. The reflectance is small over open water in general, except for the solar salt plant region (−99.25°, 36.75°) in Figure 2e. The land cover map for the year 2018 is from the Cropland Data Layer dataset (https://nassgeodata.gmu.edu/CropScape/ (accessed on 11 September 2020)) hosted by CropScape for the continental United States, which has a 30-m spatial resolution [55,56]. This surface reflectance method works well when the land surface conditions (e.g., vegetation) are stable and vary slowly. A more detailed discussion can be found in Section 5.
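The per-pixel selection of the clear-sky reflectance from the CDF can be sketched as follows. This is only an illustration under stated assumptions: the function name is hypothetical, reflectances are assumed non-negative, and the center of the winning 0.01-wide bin is returned (the text does not say which value within the bin is used).

```python
import math


def clear_sky_reflectance(samples, bin_width=0.01):
    """Reflectance at the steepest rise of the empirical CDF for one pixel.

    `samples` holds the reflectance values of one pixel for one hourly
    interval (540 values in the text). Returning the bin center is an
    assumption of this sketch.
    """
    n_bins = int(math.floor(max(samples) / bin_width)) + 1
    counts = [0] * n_bins
    for r in samples:
        counts[min(int(math.floor(r / bin_width)), n_bins - 1)] += 1

    # Cumulative counts; dividing by len(samples) would give the CDF but
    # does not change where the gradient is largest.
    cumulative = []
    running = 0
    for c in counts:
        running += c
        cumulative.append(running)

    def gradient(i):
        return cumulative[i] - (cumulative[i - 1] if i > 0 else 0)

    steepest = max(range(n_bins), key=gradient)  # first bin wins on ties
    return (steepest + 0.5) * bin_width


# Toy pixel: two shadowed scenes, five clear scenes, three cloudy scenes.
toy_samples = [0.112, 0.118, 0.153, 0.155, 0.157, 0.154, 0.156, 0.251, 0.302, 0.405]
toy_clear = clear_sky_reflectance(toy_samples)
```

Because the gradient of a binned CDF equals the number of samples in each bin, this choice amounts to picking the most populated reflectance bin, which is why enough clear-sky scenes are needed among the sampled days.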

The GOES Simulator
The ShCu detection threshold ∆R is estimated from GOES reflectance by cross-validation with cloud fraction (CF) from stereo cameras at the ARM SGP site. To estimate an accurate ∆R, the surface camera data and the GOES satellite data should be aligned well to observe the same cloud fields. However, the parallax issue, the displacement of cloud location and the enhancement of cloud fraction/size, occurs in satellite imaging, particularly for the surface regions with high viewing zenith angles [57][58][59][60][61] (Figure 3a). Due to the 48° viewing angle of GOES-16 towards SGP, the parallax shift leads to a difference between cloud images from satellites looking at clouds in a slanted view and from the COGS data product projecting clouds to the surface in the zenith direction.

To better estimate ∆R for every satellite image during the ShCu days, a GOES simulator is designed using observations from stereo cameras, projecting clouds observed in the COGS reconstructable region along the slanted viewing direction of GOES-16 to the land surface. The projected surface for the GOES simulator is the same as the ellipsoid surface used in the GOES ABI data. The elevation of the SGP central facility above the GOES ellipsoid surface is 285 m after considering the terrain correction for the GOES satellite. The satellite viewing zenith angle and satellite azimuth angle are 48.64051° and 145.23258° at the COGS domain center (36.60529°, −97.48642°). Both angles are assumed constant in the GOES simulator, which is reasonable because the domain of interest is very small compared with the Earth-satellite distance. The output from the GOES simulator is the mean path through clouds projected at each GOES ABI pixel based on COGS data every 20 s. The mean cloud path is calculated as the volume of the clouds intercepted by lines connecting the GOES satellite to locations in the GOES surface pixel, divided by the area of the GOES pixel.
Figure 3. (a) Sketch of the projected location of a cloud on the Earth's surface, seen from nadir and from GOES with a satellite zenith angle (θ). The gray box represents the 6 km × 6 km × 6 km COGS domain, where ShCu can be observed by the surface stereo camera system. A GOES simulator is developed, projecting clouds observed by surface stereo cameras towards the land surface along the slanted viewing direction of GOES-16. Two example simulator slanted paths (green and orange lines) are provided in (a). For the green path, the minimum and maximum heights are 0 km and 6 km. For the orange path, the minimum height is around 3 km, much higher than 0 km. (b) The distance passing through the COGS reconstructable region along the path from the GOES-16 satellite to the GOES pixel at the surface. The (c) minimum and (d) maximum heights of the COGS reconstructable region intercepted by paths from the GOES-16 satellite to the GOES surface pixel. (e) A mask of the GOES simulator. Each marked (black) point at the surface in (e) corresponds to one GOES slanted viewing path. Along these marked slanted viewing paths, pixels below 6 km seen by GOES are all within the 6 km × 6 km × 6 km COGS domain. In (c) to (e), the red box is the actual location of the COGS domain, and the magenta box, as a reference, is a shifted domain of the same size, 0.015° north and west of the COGS domain, due to the GOES slanted view.
When GOES-16 looks towards the surface through the original 6 km × 6 km × 6 km COGS domain in the slanted view, the path length and the minimum and maximum heights along the path intercepted by the COGS reconstructable region vary from GOES pixel to pixel (Figure 3b-d). Two example paths (green and orange lines) are provided in Figure 3a. For the green path, the minimum and maximum heights are 0 km and 6 km, while for the orange path, the minimum height is 2 km, which indicates that, along this GOES slanted viewing path, pixels observed by GOES below 2 km are all outside of the 6 km × 6 km × 6 km COGS domain. In other words, along this slanted path, GOES observes more clouds than the GOES simulator, if any cloud occurs in the orange shaded area in Figure 3a.
In order to best match clouds observed by the GOES satellite and GOES simulator based on COGS data, we need to select an analysis domain in which ShCu observed by GOES all come from the COGS reconstructable region. Using COGS observed shallow cumulus data during July to August in 2018 and 2019, we find that the shallow cumulus cloud base heights are all above 0.65 km, and 95% of the cloud top heights are lower than 2.55 km. Thus, a valid slanted path is selected with a minimum height lower than 0.65 km and a maximum height higher than 2.55 km. The corresponding pixels at the surface with valid slanted paths are labeled as black points in Figure 3e (we refer to this as the GOES simulator mask hereafter).
The COGS data product projects clouds reconstructed by stereo cameras in the zenith direction toward the surface over a 6 km by 6 km domain. The location of projection is shown as the red box in Figure 3e. The GOES simulator projects clouds reconstructed by surface stereo cameras along the GOES slanted viewing direction. We found that all the GOES simulator marked points are within a 6 km by 6 km domain of the same size (magenta box in Figure 3e), which is 0.015° north and west of the actual COGS domain. This indicates that the shifts from the actual COGS location to the satellite off-nadir displaced location are approximately 0.015° in both N-S and W-E directions.
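The size of this parallax displacement can be checked with a back-of-the-envelope sketch. It assumes a locally flat surface, a satellite azimuth measured clockwise from north and pointing from the site towards the satellite, a spherical-Earth conversion of about 111.2 km per degree, and a representative cloud height of 2 km above the projection surface. These choices and the function names are illustrative assumptions, not the implementation of the GOES simulator.

```python
import math

VZA_DEG = 48.64051      # viewing zenith angle at the COGS domain center (from the text)
SAT_AZ_DEG = 145.23258  # satellite azimuth (from the text); clockwise from north is assumed
SITE_LAT_DEG = 36.60529
KM_PER_DEG = 111.2      # spherical-Earth approximation (assumption)


def parallax_shift_km(height_km, vza_deg=VZA_DEG, az_deg=SAT_AZ_DEG):
    """Eastward and northward displacement of a cloud's apparent surface position.

    The apparent position moves away from the satellite by height * tan(VZA).
    """
    distance = height_km * math.tan(math.radians(vza_deg))
    east = -distance * math.sin(math.radians(az_deg))
    north = -distance * math.cos(math.radians(az_deg))
    return east, north


def shift_in_degrees(east_km, north_km, lat_deg=SITE_LAT_DEG):
    """Convert a displacement in km into (delta longitude, delta latitude) in degrees."""
    dlat = north_km / KM_PER_DEG
    dlon = east_km / (KM_PER_DEG * math.cos(math.radians(lat_deg)))
    return dlon, dlat


east_km, north_km = parallax_shift_km(2.0)  # 2 km height is an assumed example
dlon_deg, dlat_deg = shift_in_degrees(east_km, north_km)
```

Under these assumptions a cloud element near 2 km is displaced towards the northwest by an amount on the order of 0.015° in each direction, which is in line with the offset of the magenta box described above.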

Determination of ShCu Detection Threshold ∆R
We estimate the ShCu detection threshold ∆R for each of the GOES-16 reflectance images based on the cloud fractions (CF) from the GOES simulator. The CF for the GOES simulator is calculated as the number of cloudy pixels divided by the total number of pixels within the GOES simulator mask region (Figure 3e). A cloud path distance threshold is needed to identify whether a pixel is clear sky or cloudy. This cloud path distance threshold is estimated for each GOES simulator image based on the COGS 20 s data.
The estimation of the ShCu detection threshold ∆R is based on four images (Figure 4a-d): (Image A) a high-resolution (50 m) cloud thickness image from the COGS data; (Image B) a low-resolution (at GOES resolution) cloud thickness image derived from the high-resolution COGS data; (Image C) a cloud path distance image from the GOES simulator; and (Image D) a reflectance difference image from GOES-16.
Here, we introduce how to prepare these four images. For a given GOES image, one snapshot of 3D COGS data is selected closest to the time of the GOES image scan over the SGP central facility. Contiguous US images from GOES-16 scans are provided every 5 min. The GOES ABI CMI product does not provide the scan time stamp for individual pixels. The pixel scan time over the SGP central facility is calculated using the GOES calibration tool, an ABI time model look-up table (available at https://www.star.nesdis.noaa.gov/GOESCal/goes_tools.php (accessed on 18 March 2021)).
We then use this selected snapshot of COGS 3D data to generate two cloud thickness images (Images A and B). At each (x, y) location in the COGS domain, the cloud physical thickness is calculated (Figure 4a, Image A). The high-resolution (50 m) COGS cloud thickness is then averaged within the GOES ABI pixel (approximately 650-m resolution at SGP) to generate a low-resolution COGS cloud thickness image (Figure 4b, Image B). Figure 4c (Image C) is the cloud path distance map from the GOES simulator based on the selected snapshot of COGS data. The reflectance difference in Figure 4d (Image D) is calculated as the difference between the GOES reflectance and the clear-sky surface reflectance at the 0.64 µm channel. Using these four images, Figure 4e summarizes the assumptions, major steps, and datasets used to estimate ∆R in three steps.
Figure 4. (c) The mean distance through clouds observed by stereo cameras (the COGS data) along the path from the GOES-16 satellite to the GOES surface pixel. (d) Reflectance difference from GOES data, which is the difference between GOES reflectance and clear-sky surface reflectance at the 0.64 µm "red" band. The red box is the actual COGS domain, and the magenta box is a domain of the same size, 0.015° north and west of the COGS domain. (e) A flowchart to summarize the assumptions, major steps, and datasets used to estimate ∆R, a reflectance threshold to identify a cloudy pixel in the satellite data. A cloudy pixel in the satellite image is defined when the GOES reflectance exceeds the clear-sky surface reflectance by a certain value, i.e., ∆R.
First, we determine a cloud thickness threshold on the low-resolution cloud thickness image (Image B). From the high-resolution cloud thickness image (Image A), a reference cloud fraction (CF_COGS_HighRes) is obtained by defining a cloudy pixel as one with a cloud physical thickness larger than zero. From the low-resolution cloud thickness image, a simple 1D look-up table is generated: for a given thickness threshold, we count the number of cloudy pixels with cloud thickness greater than this threshold and calculate the corresponding CF (CF_COGS_LowRes). The final thickness threshold is determined at the low-resolution COGS grid as the value making CF_COGS_LowRes equal to CF_COGS_HighRes.
Second, we determine cloudy pixels on the cloud path distance map from the GOES simulator (Figure 4c, Image C) using the thickness threshold from the step above. If the distance through clouds is larger than the cloud thickness threshold, we identify this pixel as cloudy, and then we calculate the CF (CF_simulator) as the ratio of the number of cloudy pixels to the number of total pixels in the GOES simulator mask region (Figure 3e).
Last, we determine the ShCu detection threshold ∆R. A simple 1D look-up table is generated: for a given reflectance detection threshold, ∆R, the cloudy pixels are identified as those with a reflectance difference greater than this threshold, and the CF (CF_GOES) is calculated on the reflectance difference image (Figure 4d, Image D) in the GOES simulator mask region (Figure 3e). The final reflectance detection threshold, ∆R, is determined as the value making CF_GOES equal to CF_simulator.
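The three matching steps can be sketched with one small helper that scans a 1D look-up table of candidate thresholds. The flattened-list inputs, the candidate grids, and the rule of taking the smallest candidate when several match equally well are assumptions of this sketch; the text only states that the threshold is the value that makes the two cloud fractions equal.

```python
def cf_above(values, threshold):
    """Cloud fraction: share of pixels whose value exceeds the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


def match_threshold(values, target_cf, candidates):
    """Candidate threshold whose CF is closest to target_cf (first one on ties)."""
    return min(candidates, key=lambda t: abs(cf_above(values, t) - target_cf))


def estimate_delta_r(high_res_thickness, low_res_thickness, simulator_path,
                     reflectance_diff, thickness_candidates, delta_r_candidates):
    # Step 1: reference CF from Image A, then thickness threshold on Image B.
    cf_high = cf_above(high_res_thickness, 0.0)
    thickness_threshold = match_threshold(low_res_thickness, cf_high, thickness_candidates)
    # Step 2: CF of the GOES simulator (Image C) using that threshold.
    cf_simulator = cf_above(simulator_path, thickness_threshold)
    # Step 3: reflectance threshold on Image D that reproduces CF_simulator.
    delta_r = match_threshold(reflectance_diff, cf_simulator, delta_r_candidates)
    return thickness_threshold, cf_simulator, delta_r


# Toy inputs (values inside the GOES simulator mask, flattened); all hypothetical.
image_a = [200.0, 40.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]  # metres, 50 m grid
image_b = [120.0, 25.0, 0.0, 0.0]                      # metres, GOES grid
image_c = [150.0, 40.0, 20.0, 0.0]                     # metres, slanted cloud path
image_d = [0.12, 0.052, 0.031, 0.01]                   # reflectance difference
thickness_grid = list(range(0, 200, 10))
delta_r_grid = [i / 1000 for i in range(0, 201, 5)]

toy_thickness_threshold, toy_cf_simulator, toy_delta_r = estimate_delta_r(
    image_a, image_b, image_c, image_d, thickness_grid, delta_r_grid
)
```

Reusing the same matching helper for the thickness and the reflectance thresholds mirrors the two look-up tables described above; with real images a finer candidate grid or a bisection search could replace the linear scan.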
To estimate an accurate ∆R, the cloud images from the GOES simulator and the GOES reflectance data should be aligned well to observe the same cloud fields as much as possible. GOES can observe multi-layer clouds, e.g., ShCu in the COGS cube overlapping with high clouds above 6 km outside the COGS cube; however, COGS can only observe clouds under 6 km. In other words, clouds higher than 6 km outside the COGS reconstructable region can be observed by GOES but not by the GOES simulator. To best facilitate a like-for-like comparison of cloud fractions for an accurate determination of ∆R, in the last step described above, we further (1) select images with a GOES mean reflectance less than 0.2 to exclude a few images with multi-layer clouds; (2) select the pairs of GOES simulator and GOES images with a normalized cross-correlation larger than 0.5. Normalized cross-correlation, usually in its two-dimensional version, is a metric to evaluate the degree of similarity between two images [57]. Two-dimensional fast Fourier transform convolutions of the image are used to speed up the computation of the correlation [58].
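The two screening criteria can be illustrated with a simplified sketch. It computes only the zero-lag, mean-removed normalized cross-correlation in the spatial domain, which is a stand-in for the FFT-based two-dimensional computation mentioned in the text; the function names and the exact images entering the correlation are assumptions.

```python
import math


def normalized_cross_correlation(image_a, image_b):
    """Zero-lag, mean-removed normalized cross-correlation of two equal-size images."""
    a = [v for row in image_a for v in row]
    b = [v for row in image_b for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0  # flat images carry no pattern to compare


def keep_image_pair(simulator_image, goes_image, goes_mean_reflectance,
                    max_mean_reflectance=0.2, min_ncc=0.5):
    """Apply both criteria: no bright multi-layer scene, and similar cloud patterns."""
    if goes_mean_reflectance >= max_mean_reflectance:
        return False
    return normalized_cross_correlation(simulator_image, goes_image) > min_ncc


sim_example = [[0.0, 1.0], [2.0, 3.0]]
goes_example = [[0.0, 2.0], [4.0, 6.0]]  # same pattern, different scale
ncc_example = normalized_cross_correlation(sim_example, goes_example)
```

A full implementation would evaluate the correlation over spatial lags, which is where the FFT-based convolution becomes worthwhile.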

Results
In this section, we first show the results using the dynamic ∆R estimated from each of the satellite images. These dynamic ∆R values can only be estimated when observations from ground-based stereo cameras are available as a constraint. If there are no ground-based observations, can we use a constant threshold ∆R to detect ShCu? In the following, we search for an optimal constant reflectance threshold. Based on this optimal constant reflectance threshold, we further provide the cloud size distribution and the diurnal cycles of cloud fraction and cloud size.

Dynamic and Best-Fit Constant ∆R
The distributions of the dynamic ShCu cloud detection threshold ∆R are shown in Figure 5a based on the CF comparison of the 773 pairs of GOES reflectance difference images and GOES simulator images during the ShCu days. ∆R varies from 0.01 to 0.2, with 83% of the data samples between 0.02 and 0.07 and a mode value at 0.04.
The best-fit constant ∆R is determined as the value making the smallest difference in CF between using the dynamic ∆R values and the constant ∆R. A mean bias is calculated to quantify the difference. In Figure 5b, the mean bias of CF approaches zero when ∆R is 0.045, suggesting that the best-fit constant ∆R is 0.045.
To evaluate the performance of this best-fit ∆R, we calculate the percent error (PE) as

$$\mathrm{PE}_i = 100 \times \frac{\left| y_{d,i} - y_{c,i} \right|}{y_{d,i}},$$

where $y_{d,i}$ and $y_{c,i}$ are the CFs of the $i$th image estimated using dynamic and constant ∆R values [59]. Figure 5c shows the distributions of the probability density function (PDF) and cumulative density function (CDF) of the CF percent errors using constant ∆R = 0.045. The CDF of PE (Figure 5c, green line) shows that the CF errors of half of the data (CDF below 50%) are less than 20%, and the CF errors of roughly another third of the data (CDF from 50% to 80%) are between 20% and 40%. Further discussion of the percent errors can be found in Section 5.
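The evaluation metrics can be written out as a short sketch. It assumes that the CF obtained with the dynamic ∆R is the reference in the denominator of the percent error and that the mean bias is taken as constant minus dynamic; both conventions, as well as the function names and sample values, are assumptions for illustration.

```python
def percent_errors(cf_dynamic, cf_constant):
    """Percent error per image, skipping images whose reference CF is zero."""
    return [
        100.0 * abs(d - c) / d
        for d, c in zip(cf_dynamic, cf_constant)
        if d > 0
    ]


def mean_bias(cf_dynamic, cf_constant):
    """Mean of (constant-threshold CF minus dynamic-threshold CF)."""
    diffs = [c - d for d, c in zip(cf_dynamic, cf_constant)]
    return sum(diffs) / len(diffs)


# Hypothetical CFs for two images.
cf_dyn = [0.2, 0.4]
cf_con = [0.25, 0.3]
pe_example = percent_errors(cf_dyn, cf_con)
bias_example = mean_bias(cf_dyn, cf_con)
```

Scanning candidate constant ∆R values and keeping the one whose mean bias is closest to zero would reproduce the selection logic described for the best-fit constant threshold.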

Cloud Size Distributions
In addition to cloud fraction, the size of each individual cloud is estimated from the GOES simulator and from the GOES reflectance difference images. Cloud size is defined as the square root of the projected surface area, which is calculated as the number of cloudy pixels times the area per pixel. To show the GOES detectability of small clouds, the first nine bins in Figure 6 correspond to the sizes of clouds composed of 1 to 9 pixels on the satellite image. Above 2.0 km, the bin width is 0.2 km. Figure 6 confirms that using a constant ∆R of 0.045 to detect ShCu yields a cloud size distribution very similar to that derived using dynamic ∆R values. The constant threshold slightly underestimates (overestimates) the probability of clouds composed of one pixel (two pixels) on the satellite image. For clouds composed of more than three cloudy pixels, there are only minor differences among the cloud size probabilities estimated from the GOES simulator and from GOES data using the dynamic and constant (0.045) ∆R thresholds.
Figure 6. Cloud size distributions based on all the shallow cumulus cases from re-gridded low-resolution COGS, the GOES simulator, and GOES data using dynamic and constant (0.045) ∆R thresholds. The first nine bins correspond to the cloud sizes estimated from clouds composed of 1 to 9 pixels on the satellite image. Above 2.0 km, the bin width is 0.2 km.
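The cloud size definition above (square root of the number of cloudy pixels times the area per pixel) can be illustrated with a minimal sketch. The function name `cloud_sizes_km`, the use of 4-connectivity to group cloudy pixels into individual clouds, and the pixel area in the usage note are assumptions made for illustration; the text does not state how cloud objects are segmented.

```python
import math
from collections import deque


def cloud_sizes_km(mask, pixel_area_km2):
    """Return the size (km) of each cloud object in a 2-D 0/1 cloud mask.

    Assumption: cloudy pixels sharing an edge (4-connectivity) form one cloud.
    """
    n_rows, n_cols = len(mask), len(mask[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    sizes = []
    for r in range(n_rows):
        for c in range(n_cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill to count the pixels of one cloud object.
            n_pixels = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                i, j = queue.popleft()
                n_pixels += 1
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    inside = 0 <= ni < n_rows and 0 <= nj < n_cols
                    if inside and mask[ni][nj] and not seen[ni][nj]:
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            # Cloud size = sqrt(number of cloudy pixels x area per pixel).
            sizes.append(math.sqrt(n_pixels * pixel_area_km2))
    return sizes
```

For example, with a hypothetical pixel area of 0.25 km², a one-pixel cloud has a size of 0.5 km and a two-pixel cloud a size of about 0.71 km, which is why the smallest size bins map directly onto pixel counts.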
The cloud size distribution from COGS at GOES resolution (low-resolution COGS) is also plotted in Figure 6. The difference between the distributions derived from GOES and from low-resolution COGS may arise because (1) clouds appear larger when seen from GOES at a slanted view than when projected in the zenith view, as in the COGS data product, and (2) smaller clouds can be partially or fully blocked by larger clouds in the GOES simulator owing to the slanted view. Because of these two effects, the CFs from the nadir view and from the slanted view are not the same.
Figure 7a shows an example of a CF diurnal cycle estimated using the constant ∆R (0.045) within the GOES simulator mask on 11 July 2018 from 18 to 23 UTC. The CFs are compared with those from the GOES operational ACM data product at 2 km spatial and 5 min temporal resolutions. As seen in Figure 7a, in many cases GOES ACM does not report cloudy pixels when the CFs are around or below 0.1. Figure 7b shows the evolution of cloud sizes. The mean and median cloud sizes increase from 18 to 20 UTC (12 to 14 local time) and then decrease after 20 UTC. In line with the GOES ACM results shown in Figure 7a, when the cloud sizes are smaller than 1 km, the operational GOES ACM product tends to report pixels as clear sky instead of cloudy.

Discussions
In this section, we discuss the performance of the best-fit constant ∆R value of 0.045 and the applicability of this shallow cumulus detection threshold.
We first conduct several sensitivity tests on the constant detection threshold ∆R. Figure 5c also shows the distributions of the percent errors of CFs from satellite data estimated using 0.035 and 0.055 in addition to 0.045. The larger the probability of a small percent error, the better the result. The PDF (and CDF) values for percent errors smaller than 10% (first bin on the x-axis) are smaller when using constant ∆R values of 0.035 or 0.055 than when using 0.045, which indicates that 0.045 performs better.
Using 5 min data, the mean CF bias for different ∆R values is further calculated with respect to different ranges of in-cloud thickness, defined as the mean thickness of the cloudy pixels on each COGS image (Figure 8a). In general, the mean bias is very close to zero when ∆R equals 0.045, much smaller than when using 0.035 or 0.055. The mean bias is negative (positive) when the in-cloud thickness is less (greater) than 150 m, but the CF mean bias values are always within ±0.013. The mean CF bias values are also calculated in different CF ranges (Figure 8b). With ∆R set to 0.045, the mean biases are within ±0.02 when CF is less than 0.4. When CF is larger than 0.4, using a smaller ∆R (e.g., 0.035) yields better results than using 0.045. However, the occurrence probability of continental shallow cumulus CF larger than 0.4 is very small [11,62]. In the comparisons of hourly mean CFs estimated using dynamic and constant ∆R thresholds (Figure 8c), the mean biases are within ±0.01 in all the CF ranges when using ∆R of 0.045.
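The binned mean-bias calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `mean_bias_by_bin`, the half-open bin convention, and the choice of reference CF (for instance the CF from the GOES simulator or from the dynamic ∆R) are assumptions.

```python
def mean_bias_by_bin(cf_test, cf_ref, bin_var, bin_edges):
    """Mean of (cf_test - cf_ref) for samples whose bin_var lies in each bin.

    Bins are half-open, [lo, hi). Empty bins return None.
    cf_ref is an assumed reference CF; the binning variable can be the
    in-cloud thickness (as in Figure 8a) or the CF itself (as in Figure 8b).
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for test, ref, x in zip(cf_test, cf_ref, bin_var):
        for k in range(n_bins):
            if bin_edges[k] <= x < bin_edges[k + 1]:
                sums[k] += test - ref
                counts[k] += 1
                break
    return [s / n if n else None for s, n in zip(sums, counts)]
```

Running the same function once per candidate threshold (0.035, 0.045, 0.055) gives one bias curve per ∆R value, which is the kind of comparison summarized in Figure 8.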

Using the constant ∆R of 0.045 as the detection threshold, we further investigate the detection probability with respect to cloud fraction and cloud thickness. Within the COGS domain, we calculate the in-cloud thickness and cloud fraction for each GOES ABI grid based on the high-resolution (50 m) COGS data product. A cloudy pixel in the 50 m resolution COGS product is defined as one with a cloud thickness larger than 0. The grid cloud fraction is calculated as the ratio of the number of cloudy pixels to the total number of pixels within one GOES ABI grid.
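The per-grid cloud fraction and in-cloud thickness can be illustrated with a short sketch. It assumes, purely for illustration, that each GOES ABI grid covers a square block of `block` by `block` high-resolution COGS pixels aligned with the COGS array; the real ABI grid geometry is not described here, and the function name `grid_cloud_stats` is hypothetical.

```python
def grid_cloud_stats(thickness_m, block):
    """Aggregate a high-resolution cloud-thickness field onto coarse grid cells.

    Returns (cf, thick): 2-D lists of cloud fraction and in-cloud mean
    thickness per coarse cell (thickness is None where the cell is clear).
    Assumption: coarse cells are aligned square blocks of fine pixels.
    """
    n_rows = len(thickness_m) // block
    n_cols = len(thickness_m[0]) // block
    cf = [[0.0] * n_cols for _ in range(n_rows)]
    thick = [[None] * n_cols for _ in range(n_rows)]
    for gi in range(n_rows):
        for gj in range(n_cols):
            cell = [
                thickness_m[i][j]
                for i in range(gi * block, (gi + 1) * block)
                for j in range(gj * block, (gj + 1) * block)
            ]
            # A fine pixel is cloudy when its cloud thickness is larger than 0.
            cloudy = [t for t in cell if t > 0]
            # Grid cloud fraction: cloudy pixels over total pixels in the cell.
            cf[gi][gj] = len(cloudy) / len(cell)
            if cloudy:
                # In-cloud thickness: mean over cloudy pixels only.
                thick[gi][gj] = sum(cloudy) / len(cloudy)
    return cf, thick
```

Averaging thickness over cloudy pixels only, rather than over the whole cell, keeps the thickness and cloud fraction axes independent when the two are later binned jointly.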
For each GOES ABI grid, the satellite reflectance is used to identify the grid as either cloudy or clear sky using the ∆R threshold of 0.045. Figure 9a shows the joint probability of the identified cloudy pixels in different bins of in-cloud thickness and cloud fraction. Note that probabilities are only shown in the bins with sample numbers (Figure 9b) larger than 100. Probabilities larger than 50% (in reddish colors) indicate that the grid is more likely to be identified as cloudy than as clear sky. As shown in Figure 9a, when the cloud thickness is smaller than 250 m, the 50% probability is reached at lower cloud fractions as cloud thickness increases. For example, the 50% probability is found for a cloud thickness of 100 m (second bin on the x-axis) with a cloud fraction of 0.5, and for a cloud thickness of 150 (200) m with a cloud fraction of 0.4 (0.3), while for a cloud thickness of 250 m with a cloud fraction of 0.2, 60% of the cases in that bin are identified as cloudy. When the cloud thickness is greater than 250 m, the cases in those bins are usually also associated with high cloud fractions, e.g., 0.5 or above. Moreover, for those bins, usually 60% or more of the cases are identified as cloudy.
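A minimal sketch of the binned detection probability is given below. It assumes each sample is a tuple of (in-cloud thickness, grid cloud fraction, detected flag), where the flag records whether the grid was identified as cloudy with the ∆R threshold; the function names and the half-open bin convention are illustrative choices. The `min_count` default of 100 follows the sample-number cutoff stated in the text.

```python
from bisect import bisect_right


def bin_index(value, edges):
    """Index k with edges[k] <= value < edges[k + 1], or None if out of range."""
    k = bisect_right(edges, value) - 1
    return k if 0 <= k < len(edges) - 1 else None


def detection_probability(samples, thick_edges, cf_edges, min_count=100):
    """Fraction of grids identified as cloudy in each (thickness, CF) bin.

    samples: iterable of (thickness_m, cloud_fraction, detected) tuples.
    Only bins with more than min_count samples are returned.
    Note: bins are half-open, so pick the last edge slightly above the
    largest value (e.g., above 1.0 for cloud fraction) to include it.
    """
    counts, hits = {}, {}
    for thickness, cf, detected in samples:
        key = (bin_index(thickness, thick_edges), bin_index(cf, cf_edges))
        if None in key:
            continue  # sample falls outside the binned range
        counts[key] = counts.get(key, 0) + 1
        hits[key] = hits.get(key, 0) + (1 if detected else 0)
    return {key: hits[key] / n for key, n in counts.items() if n > min_count}
```

A returned value above 0.5 for a bin corresponds to the reddish bins in Figure 9a, where a grid is more likely to be flagged as cloudy than as clear sky.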
In this study, we suggest using GOES ABI 'red' channel reflectance with a constant cloud detection threshold (0.045) to detect ShCu clouds at SGP. However, further validation might be needed at other locations to test the universal applicability of this ShCu detection threshold value. Our study is designed specifically for ShCu detection, which is missing from the operational GOES ABI cloud mask, rather than to provide a method to detect all kinds of clouds under different conditions. The shallow cumulus cloud detection threshold may not work well for multi-layer clouds, such as cirrus clouds overlapping with ShCu. Case selection by screening satellite images is necessary to ensure that there are no higher-level clouds above lower-level ShCu, such that the clouds detected using our method are valid shallow cumulus. Additional observations from the GOES ABI 1.38 µm channel can be useful to detect cirrus clouds [63].
The procedure to generate the clear-sky reflectance maps can be applied to other locations over land with prevailing ShCu populations. However, it does have some limitations. It is not suitable for surfaces with very high reflectance. For example, fresh snow cover during wintertime has a high visible reflectance similar to that of clouds, which may lead to large uncertainty in distinguishing clear-sky pixels from cloudy pixels. Moreover, any short-term changes in the land surface properties must be considered. For example, the winter wheat harvest takes place in the SGP region during May and June, and the reflectance of the winter wheat fields changes before and after the harvest. Cloud detection therefore cannot be based on only one surface reflectance map for the whole SGP warm season. Another example is Eufaula Lake (96°W to 95.5°W; 35°N to 35.5°N), labeled as open water in Figure 2e, which is impounded by the Eufaula dam. The lake reflectance changes when the Eufaula dam provides flood control, as the visible-channel reflectance of turbid water is very different from that of clear water [64]. The surface reflectance method introduced in this study may not work well with such large variations over a dammed lake during the flooding season. In our study, because the vegetation change is not significant in July and August at SGP, the surface reflectance map was generated using four months of data in July and August from 2018 and 2019. However, the accuracy of the estimated surface reflectance could be improved by using observations from a running time period, such as using 60-day observations to estimate the surface reflectance [54].
Although further investigation may be needed to confirm its applicability to other regions, this new method enables the detection of small and short-lived ShCu during daytime. This ShCu detection method makes it feasible to trace deep convective clouds back to their "seeds", the shallow cumulus. Such a capability bridges gaps in observational studies, especially in detecting the initial life stages of the shallow-to-deep convection transition. The method can provide a large sample set of ShCu clouds in the SGP region from GOES observations, covering different locations and different stages of the cloud life cycle. It allows us not only to analyze ShCu cloud object characteristics but also to identify relationships among cloud size, rain likelihood, environmental conditions, and land surface properties. Such data may improve the general understanding of rain production and help to constrain ShCu clouds and their impacts in climate models.
The framework developed in this study can be applied to other satellite sensors, such as the Advanced Himawari Imager (AHI) onboard Himawari-8, if similar observations (e.g., stereo cameras) are available at ground level. When there are more ShCu cases observed by ground-based stereo cameras, especially those with GOES-16 and GOES-17 concurrently, further improvement of ShCu detection may be possible by using a dual-satellite (GOES-16 and GOES-17) approach with simultaneous satellite measurements.

Conclusions
In this study, summertime continental ShCu are detected by taking advantage of the high temporal and spatial resolutions of GOES-16 reflectance data and three-dimensional cloud structures retrieved from ground-based stereo cameras. A ShCu cloudy pixel is identified when the GOES reflectance exceeds the clear-sky surface reflectance by the ShCu reflectance detection threshold, ∆R. We first derive the diurnally varying clear-sky surface reflectance maps. Next, the detection threshold for shallow cumulus (∆R) is determined according to the accurate cloud statistics from the surface stereo cameras. When images from the satellite and the ground-based stereo cameras are compared, the effect of cloud parallax on GOES satellite images is carefully considered through a newly developed GOES simulator, which projects the clouds reconstructed by the stereo cameras towards the surface along the GOES-16's slanted viewing direction. The dynamic ShCu detection threshold ∆R is determined by making the GOES CF equal to the CF from the GOES simulator.
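The detection rule and the CF-matching idea summarized above can be sketched in a few lines. This is an illustration under stated assumptions rather than the implementation used in the study: the function names are hypothetical, the reflectances are passed as flat lists of co-located pixels, and the dynamic ∆R is approximated by searching a user-supplied list of candidate thresholds for the one whose CF is closest to the GOES simulator CF.

```python
def cloud_fraction(refl, clear_refl, delta_r):
    """Fraction of pixels whose reflectance exceeds clear sky by more than delta_r."""
    flags = [r - c > delta_r for r, c in zip(refl, clear_refl)]
    return sum(flags) / len(flags)


def dynamic_delta_r(refl, clear_refl, cf_target, candidates):
    """Candidate threshold whose CF is closest to the target (simulator) CF.

    Assumption: a discrete search over candidates stands in for whatever
    matching procedure is actually used to make the two CFs equal.
    """
    return min(
        candidates,
        key=lambda d: abs(cloud_fraction(refl, clear_refl, d) - cf_target),
    )
```

Because the CF decreases monotonically as the threshold increases, a finer candidate list (or a bisection over ∆R) would bring the matched CF arbitrarily close to the target for a given scene.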
When ground-based observations are not available as a constraint to estimate the dynamic reflectance threshold ∆R, similar cloud size distributions and cloud fractions can be reproduced using a constant cloud detection threshold (0.045) at the ARM SGP site. The best-fit constant ∆R (0.045) is determined as the value that gives the smallest CF difference between using the dynamic ∆R values and using the constant ∆R. Several sensitivity tests on the constant detection threshold confirm that a ∆R of 0.045 performs better than other values (e.g., 0.035 or 0.055). The cloud size distribution and the diurnal cycles of cloud fraction and cloud size are also derived and show a reasonable performance of 0.045 as the shallow cumulus detection threshold ∆R. The mean biases for the hourly mean cloud fraction are within 1 percent compared with the results using the dynamic ∆R. Detection of low-level, small, and short-lived ShCu during daytime is a missing feature in the current GOES cloud mask product. Using the method presented in this study, a new ShCu dataset can be generated at SGP to bridge the observational gap in detecting ShCu, which may transition into deep precipitating clouds, and to facilitate further studies on ShCu development over a heterogeneous land surface.
acknowledges support from NOAA grant NA19NES4320002 (Cooperative Institute for Satellite Earth System Studies-CISESS) at the University of Maryland/ESSIC.