Toward Long-Term Aquatic Science Products from Heritage Landsat Missions

: This paper aims at generating a long-term consistent record of Landsat-derived remote sensing reﬂectance ( R rs ) products, which are central for producing downstream aquatic science products (e.g., concentrations of total suspended solids). The products are derived from Landsat-5 and Landsat-7 observations leading to Landsat-8 era to enable retrospective analyses of inland and nearshore coastal waters. In doing so, the data processing was built into the SeaWiFS Data Analysis System (SeaDAS) followed by vicariously calibrating Landsat-7 and -5 data using reference in situ measurements and near-concurrent ocean color products, respectively. The derived R rs products are then validated using (a) matchups using the Aerosol Robotic Network (AERONET) data measured by in situ radiometers, i.e., AERONET-OC, and (b) ocean color products at select sites in North America. Following the vicarious calibration adjustments, it is found that the overall biases in R rs products are signiﬁcantly reduced. The root-mean-square errors (RMSE), however, indicate noticeable uncertainties due to random and systematic noise. Long-term (since 1984) seasonal R rs composites over 12 coastal and inland systems are further evaluated to explore the utility of Landsat archive processed via SeaDAS. With all the qualitative and quantitative assessments, it is concluded that with careful algorithm developments, it is possible to discern natural variability in historic water quality conditions using heritage Landsat missions. This requires the changes in R rs exceed maximum expected uncertainties, i.e., 0.0015 [1/sr], estimated from mean RMSEs associated with the matchups and intercomparison analyses. It is also anticipated that Landsat-5 products will be less susceptible to uncertainties in turbid waters with R rs (660) > 0.004 [1/sr], which is equivalent of ~1.2% reﬂectance. Overall, end-users may utilize heritage R rs products with “ﬁtness-for-purpose” concept in mind, i.e., products could be valuable for one application but may not be viable for another. Further research should be dedicated to enhancing atmospheric correction to account for non-negligible near-infrared reﬂectance in CDOM-rich and extremely turbid waters. non-negligible near-infrared reﬂectance in extremely turbid or CDOM-rich waters, and (c) characterizing adjacency effects under various environmental/climatic conditions.


Introduction
The Landsat program is among the longest Earth-observing programs on record, with continuous operations dating from 1972. The primary goal for the Landsat series is to monitor Earth resources from space using best-available technology. Following successful launches and operations of Landsat 1, 2, and 3, which carried analogue cameras [1], Landsat-4 and -5 carried the Thematic Mapper (TM), sampling, six spectral bands, 8-bit radiometric resolution, and a single thermal channel. A similar technology with some improvements [2] was utilized for the Enhanced Thematic Mapper Plus (ETM+) launched in 1999. The TM and ETM+ were whiskbroom imagers with 16 detectors laid out in the along-track direction [3]; thus, with each mirror scan, ~480 m is swept on the ground. The short dwell-time together with 8-bit radiometric resolution, however, yields low signal-to-noise ratios (SNR) for dark surfaces such as water bodies [4,5]. The two instruments carried stimulation lamps to allow for onboard radiance calibrations [6][7][8]. In addition to onboard calibration observations, multiple desert sites have historically been used to monitor long-term stability of these missions. For reference, the relative spectral responses (RSR) for TM and ETM+ in the visible and the near infrared (NIR) regions alongside those of the most recent Landsat sensor, i.e., the Operational Land Imager (OLI), are illustrated in Figure 1.
Currently available ocean color satellites are not suitable for long-term monitoring of estuarine environments. First, well-calibrated, dedicated ocean color sensors (e.g., the Sea-viewing Wide Fieldof-view Sensor (SeaWiFS)), have only become available since the late 1990s, so longer term historical studies [9] are not possible. Second, ocean color missions are generally equipped with 1 km nadir ground sampling distance [10]; therefore, lack sufficient spatial resolution to capture the spatial variability necessary to observe critical features, such as upwelling, small-scale eddies/blooms, river plumes, tidal effects, and local gradients in inland and nearshore waters. Third, due to steep gradients and high variability in optical properties as well as interfering optical signals from high concentrations of particles, chlorophyll-a (Chl), and colored dissolved organic matter (CDOM) in these regions, a clear delineation of these water quality properties is difficult without careful calibration and validation with in situ data [11]. Moreover, although standard, operational ocean color algorithms are well-suited for ocean color imagers (e.g., SeaWiFS) for relatively clear ocean waters, to examine inland and nearshore coastal waters, the development of algorithms customized to specific aquatic systems of interest is necessary for Landsat-type (broad-band) sensors.
The long record of Landsat observations provides unprecedented synoptic views of Earth resources; a record that has been substantially under-utilized in aquatic studies. Recognizing that the TM and ETM+ sensors (a) were equipped with only three visible bands (Figure 1), (b) lack sufficient SNR [5,12], and (c) suffer from artifacts (e.g., memory effects, striping) [13], they still have been found useful in some aquatic studies. These studies have primarily revolved around quantifying concentrations of total suspended solids (TSS) [14][15][16][17][18][19][20][21][22][23]. Attempts have also been made to quantify Chl [24][25][26][27] and the absorption by CDOM [28][29][30].  Further applications include analyzing bottom properties and bathymetry [31][32][33][34][35]. Many studies have also processed and examined time-series of images to assess feasibility of using Landsat for monitoring applications [9,27,[36][37][38][39][40][41][42][43][44][45]. Although the utilization of the Landsat archive has recently increased (since 2008 when the entire archive was made available to public at no cost [46]), the invaluable historic image data have not yet been fully exploited. To date, every retrospective study has followed a different strategy for data processing and/or atmospheric corrections [9,47]. Some studies have not even applied atmospheric corrections and used Level-1 (top-of-atmosphere; TOA) radiance (L t ) to evaluate water quality [36]. In general, over-water TM and ETM+ images are expected to contain relatively large uncertainties (compared to those of ocean color missions, like the Moderate Resolution Imaging Spectroradiometer (MODIS) [48][49][50]); however, with a common processing system and vicarious corrections to Level-1B data, we anticipate that not only can retrospective studies be conducted but also current and future research can be traced back to the past.
Owing to the successful integrations of Landsat-8 [51] and Sentinel-2 [52] processing in the SeaWiFS Data Analysis System (SeaDAS), we intend to expand this capability to Landsat-5 and -7 (to the extent possible within the limitations of instrument radiometric performances). More specifically, we implement a physics-based and extensively validated atmospheric correction method to generate Landsat products for the science community; enabling long-term studies of water quality in inland and nearshore coastal areas. Therefore, we first implement the atmospheric correction method (Section 2.1) through which non-negligible water-leaving radiance (L w ) in the near-infrared (NIR) band [53] is accounted for using a slightly different strategy than what is currently built into SeaDAS for Landsat-8 and Sentinel-2 [54]. Then, to reduce existing calibration biases in TOA observations of TM and ETM+, we perform vicarious calibrations (Section 2.2) using reference ocean color data [55,56]. Due to uncertainties in the vicarious calibrations, intercomparisons/cross-calibrations are further conducted (Section 2.3) to minimize residual biases and ensure consistency in Level-2 products, i.e., remote sensing reflectance (R rs ), with those derived from more recent missions. Sections 3.1 and 3.2 provide qualitative and quantitative assessments of TM-and ETM+-derived R rs products. The quantitative evaluation is carried out using the Aerosol Robotic Network (AERONET) data measured by in situ ocean color radiometers, i.e., AERONET-OC. Here, the remote sensing reflectance is defined as the ratio of L w to the total downwelling irradiance just above the water surface. Since AERONET-OC sites do not adequately represent aquatic/atmospheric regimes in inland areas, the heritage R rs products are further cross-validated against those derived from ocean color sensors over relatively large inland waters. Further, time-series of R rs products over 12 aquatic systems across the continental United States are analyzed (Section 3.3) to examine the utility of products derived from the three Landsat missions, i.e., TM, ETM+, and OLI. Further details on the limitations of heritage Landsat missions for retrospective water quality studies in inland and nearshore coastal areas are provided in Section 4.

Materials and Methods
Although the atmospheric correction methodology [57,58] and its integration [51] have been described in previous studies, we provide a brief description of how L w (NIR) is accounted for in the TM and ETM+ processing (Section 2.1). This section continues with computing vicarious calibration gains for TM and ETM+ data. This is an essential step towards minimizing biases in products over bodies of water [55,56], which has proven to enhance the overall quality of Landsat-8 and Sentinel-2A R rs products [52,59]. The derived gains are further slightly adjusted using cross-calibrations against ocean color products derived from OLI [59] and MODIS [48][49][50].

Atmospheric Correction
Similar to the implementations for Landsat-8 and Sentinel-2, per-pixel geometry data, i.e., view zenith/azimuth angles and solar zenith/azimuth angles, were utilized in the processing. The current methodology to correct for L w (N IR) = 0 is based on a bio-optical model, which uses R rs (443) data [54]. More specifically, this model utilizes r rs (443)/r rs (561) to estimate the spectral slope (η) of particulate backscattering (b bp ), which is applied in the retrieval of R rs (NIR), where r rs is the subsurface remote sensing reflectance [60]. The empirical method for estimating η (referred to as η 443/561 ) is as follows The TM and ETM+ spectral band centers are tabulated in Table 1. Throughout this manuscript, we refer to their visible bands as blue and green bands unless specific bands for each instrument are discussed. Since the 443 nm channel is not available for TM and ETM+, we developed a slightly different strategy. To do so, in situ radiometric data collected over various inland waters [61] were used to examine the correlations between [r rs (443)/r rs (561)] and [r rs (482)/r rs (561)] for OLI. We found a strong relationship between the two ratios ( Figure 2 (left)). Hence, the ratio in Equation (1)  subsurface remote sensing reflectance [60]. The empirical method for estimating (referred to as / ) is as follows The TM and ETM+ spectral band centers are tabulated in Table 1. Throughout this manuscript, we refer to their visible bands as blue and green bands unless specific bands for each instrument are discussed. Since the 443 nm channel is not available for TM and ETM+, we developed a slightly different strategy. To do so, in situ radiometric data collected over various inland waters [61] were used to examine the correlations between (443) (561) ⁄ and (482) (561) ⁄ for OLI. We found a strong relationship between the two ratios ( Figure 2 (left)). Hence, the ratio in Equation (1) The performance of the newly estimated / was tested against / in Figure 2 (right). Similar relationships were then built for TM and ETM+. This assessment further suggests that Equation (2) can be utilized to estimate (NIR) in an iterative fashion for processing TM and ETM+ as implemented for OLI.

Vicarious Calibration
In this exercise, the TM and ETM+ data were vicariously calibrated using different reference data to reduce systematic biases in the three visible bands (Table 1). Theoretically, all the Earth-observing satellites passing over one of the existing vicarious calibration systems, such as the Marine Optical Buoy (MOBY) [62] or BOUSSOLE [63], can be vicariously calibrated. The ETM+ data, with regular image acquisitions over both sites, can thus be vicariously calibrated. However, due to the limitations in the downlink capacity of Landsat-5, TM images were acquired neither over the MOBY nor over the BOUSSOLE site, which became operational in 2003. Hence, we employed MODIS data to vicariously calibrate, i.e., cross-calibrate, TM data. The strong relationship (left) between the band ratios of subsurface remote sensing reflectance in the blue and green channels, i.e., r rs (482/561) and r rs (443/561). The relationship follows the power-law function (A = 0.7554, B = 1.3994) and provides a high correlation coefficient, i.e., R 2 = 0.987. The scatterplot (right) shows the cross-validation of η derived using two different band ratios, i.e., r rs (443 /561) and r rs (482/561). The performance of the newly estimated η 482/561 was tested against η 443/561 in Figure 2 (right). Similar relationships were then built for TM and ETM+. This assessment further suggests that Equation (2) can be utilized to estimate R rs (NIR) in an iterative fashion for processing TM and ETM+ as implemented for OLI.

Vicarious Calibration
In this exercise, the TM and ETM+ data were vicariously calibrated using different reference data to reduce systematic biases in the three visible bands (Table 1). Theoretically, all the Earth-observing satellites passing over one of the existing vicarious calibration systems, such as the Marine Optical Buoy (MOBY) [62] or BOUSSOLE [63], can be vicariously calibrated. The ETM+ data, with regular image acquisitions over both sites, can thus be vicariously calibrated. However, due to the limitations in the downlink capacity of Landsat-5, TM images were acquired neither over the MOBY nor over the BOUSSOLE site, which became operational in 2003. Hence, we employed MODIS data to vicariously calibrate, i.e., cross-calibrate, TM data.

Landsat-7 (ETM+)
The time-series of all USGS ETM+ (Collection-1) images over the MOBY and BOUSSOLE sites were obtained and processed using SeaDAS to compute the vicarious gains [56]. The calibration is performed in two steps. In Step I, the NIR and the Short-wave Infrared (SWIR) bands are calibrated. This is carried out by selecting scenes with low aerosol loading, i.e., aerosol optical thicknesses in the NIR (AOT(835)) < 0.15. The goal is to minimize propagations of systematic errors (e.g., calibration biases) in aerosol path radiance estimates for these channels when generating R rs products [64]. With the calibration gains derived for these bands, the vicarious gains for the three visible bands were derived in Step II. Overall, 58 cloud-free images over the MOBY (n = 30) and BOUSSOLE (n = 28) sites were utilized in the analyses. While MOBY hyperspectral L w resampled to ETM+ multispectral bands were used, the BOUSSOLE narrow-band, multispectral L w were converted to ETM+ multispectral bands using a trained deep neural network [61]. The vicarious gains were initially produced using three different band combinations to account for aerosol contributions. The band combinations for ETM+ were 835-1648, 835-2205, and 1648-2205 (Table 1).
Similar to the results in Reference [59], preliminary analyses of the R rs retrievals at AERONET-OC [65] sites indicated that the most robust matchups were produced with 835 and 1648 nm bands (Table 1); therefore, we only present results derived with this band pair. The median gains within 5 × 5-element windows centered over the exact coordinates are utilized to represent spectral gains for each ETM+ image. The median statistics were chosen to suppress image artifacts, including striping. The median statistics were found to be nearly identical (∼ 10 −4 ) to the average computed within the interquartile range [66]. Furthermore, a sensitivity analysis on the window size, i.e., deriving gains for 3 × 3-, 7 × 7-, 9 × 9-, and 11×11-element kernels, indicated minimal differences, i.e., <10 −3 , in the derived gains as a function of the window size. The time-series of vicarious gains for the visible bands derived using the 835-1648 combination are shown in Figure 3. The median gains and one standard deviation (in round brackets) are reported in Table 2. The MOBY and BOUSSOLE median gains in the blue, green, and red channels were 1.033, 1.011, 1.021 and 1.027, 1.009, 1.033, respectively. The gains listed in Table 2 are the average of the two. The relatively large difference in the red channel can be attributed to uncertainties in the spectral band adjustments for the multispectral field observations at the BOUSSOLE site. While the standard deviations (σ = 2.3%) were found comparable to the gains derived from BOUSSOLE data for SeaWiFS, they are twice as large as those derived from MOBY data [66]. The high uncertainties suggest that there may be residual errors in the gains that can be identified and accounted for using further cross-calibration (see Section 2.3). Therefore, we refer to these gains as preliminary. Radiometric artifacts (e.g., detector striping) and low SNRs in the NIR and SWIR bands contribute to large uncertainties in the gains [64], as do environmental factors like cirrus clouds or surface waves.

Landsat-5 (TM)
Due to the absence of TM images over the calibration sites, we used normalized ( ) products derived from the MODIS observations collected onboard both Aqua and Terra platforms since 1999. For this exercise, near-concurrent (defined as image-pair collections on the same day) TM and MODIS images over a number of preselected locations were analyzed. The selected locations were those considered favorable sites for potential future vicarious calibration systems, due to factors such as frequency of cloud cover and temporal and spatial stability of atmosphere and water optical properties [67]. In effect, we cross-calibrated TM data with MODIS products at these stable locations. First, the cloud-and glint-free USGS TM (Collection-1) images were visually identified. The corresponding Level-2 MODIS products were obtained from NASA's web portal: https://oceancolor.gsfc.nasa.gov/. Note that these standard products have undergone very conservative pre-processing procedures. Areas with low Chl, i.e., a median of 0.25 mg m ⁄ , AOT(869) < 0.15, and within 60° viewing zenith angles, were selected to perform the vicarious calibration. These products are assumed to carry minimal uncertainties in the atmospheric correction process [66]. The median Rrs spectra were extracted from MODIS 3 × 3-element windows. The sites include West Australia (n = 60), US West Coast (n = 65), São Miguel Island (n = 25), and the Northern Mediterranean Sea (n = 46). Note that the Mediterranean site is centered around the existing BOUSSOLE site. The TM data over the other suggested calibration sites (e.g., Puerto Rico) were scarce and thus not used. A total of 196 matchups were identified for this vicarious calibration exercise. The MODIS-derived were then adjusted for differences in spectral bands to create TM-equivalent MODIS data ( ) [61]. The TM images were processed in the vicarious calibration mode in SeaDAS by supplying spectra. Given MODIS viewing zenith angles, TM image products were spatially sampled such that they accurately represent the extent "viewed" by MODIS. Similar to the ETM+ processing, the gain calculations were carried out in two steps (see Section 2.1.1) and we only present gains derived from the NIR and SWIR-I, i.e., 840-1676. Here, we assume a black ocean in these two bands, which is valid for these relatively clear waters. Figure 4 illustrates time-series of gains derived for the TM data. It should also be noted that Terra-MODIS data collected only prior to 2003 were used in the vicarious calibration [68]. The derived gains are color-coded differently for the four sites. Table 2 contains median gains and (one) standard deviations. Similar to the ETM+ gain derivations, the  Table 2. Note that the gains are derived after vicariously calibrating NIR and SWIR-I bands in Step I.

Landsat-5 (TM)
Due to the absence of TM images over the calibration sites, we used normalized L w (nL w ) products derived from the MODIS observations collected onboard both Aqua and Terra platforms since 1999. For this exercise, near-concurrent (defined as image-pair collections on the same day) TM and MODIS images over a number of preselected locations were analyzed. The selected locations were those considered favorable sites for potential future vicarious calibration systems, due to factors such as frequency of cloud cover and temporal and spatial stability of atmosphere and water optical properties [67]. In effect, we cross-calibrated TM data with MODIS products at these stable locations. First, the cloud-and glint-free USGS TM (Collection-1) images were visually identified. The corresponding Level-2 MODIS products were obtained from NASA's web portal: https://oceancolor.gsfc.nasa.gov/. Note that these standard products have undergone very conservative pre-processing procedures. Areas with low Chl, i.e., a median of 0.25 mg/m 3 , AOT(869) < 0.15, and within ±60 • viewing zenith angles, were selected to perform the vicarious calibration. These products are assumed to carry minimal uncertainties in the atmospheric correction process [66]. The median R rs spectra were extracted from MODIS 3 × 3-element windows. The sites include West Australia (n = 60), US West Coast (n = 65), São Miguel Island (n = 25), and the Northern Mediterranean Sea (n = 46). Note that the Mediterranean site is centered around the existing BOUSSOLE site. The TM data over the other suggested calibration sites (e.g., Puerto Rico) were scarce and thus not used. A total of 196 matchups were identified for this vicarious calibration exercise. The MODIS-derived R rs were then adjusted for differences in spectral bands to create TM-equivalent MODIS data (R MOD−TM rs ) [61]. The TM images were processed in the vicarious calibration mode in SeaDAS by supplying R MOD−TM rs spectra. Given MODIS viewing zenith angles, TM image products were spatially sampled such that they accurately represent the extent "viewed" by MODIS. Similar to the ETM+ processing, the gain calculations were carried out in two steps (see Section 2.2.1) and we only present gains derived from the NIR and SWIR-I, i.e., 840-1676. Here, we assume a black ocean in these two bands, which is valid for these relatively clear waters. Figure 4 illustrates time-series of gains derived for the TM data. It should also be noted that Terra-MODIS data collected only prior to 2003 were used in the vicarious calibration [68]. The derived gains are color-coded differently for the four sites. Table 2 contains median gains and (one) standard deviations. Similar to the ETM+ gain derivations, the median statistics were Remote Sens. 2018, 10, 1337 7 of 23 very similar to the mean value calculated within the interquartile range. It can be inferred that the TM radiometric responses in the visible bands are biased low. Analyses of gains computed at each site indicated noticeable differences in the blue gains, i.e., 1.031, 1.046, 1.054, and 1.056 for the four sites. Furthermore, TM images contain more systematic errors (e.g., nonlinearity in instrument response) than those of ETM+. Uncertainty in spectral band adjustments is another source of error in gain computations. Nevertheless, the vicarious gains should largely minimize the biases in the long-term Landsat-5 data record (see Section 3).
Remote Sens. 2018, 10, x FOR PEER REVIEW 7 of 23 median statistics were very similar to the mean value calculated within the interquartile range. It can be inferred that the TM radiometric responses in the visible bands are biased low. Analyses of gains computed at each site indicated noticeable differences in the blue gains, i.e., 1.031, 1.046, 1.054, and 1.056 for the four sites. Furthermore, TM images contain more systematic errors (e.g., nonlinearity in instrument response) than those of ETM+. Uncertainty in spectral band adjustments is another source of error in gain computations. Nevertheless, the vicarious gains should largely minimize the biases in the long-term Landsat-5 data record (see Section 3).  Table 2. Note that the gains are derived after vicariously calibrating NIR and SWIR-I bands.

Gain Adjustments
Considering the uncertainties (i.e., instrument artifacts and environmental noise), there remain residual biases in both TM and ETM+ gains. Therefore, these gains only provide initial estimates for the biases and slight adjustments may be inevitable. In this section, we will further examine the robustness of these gains by evaluating TM-and ETM+-derived products against those measured/derived from AERONET-OC and OLI-derived products and fine-tune them as needed. Because of their orbit configurations, Landsat-5 and -7 and Landsat-7 and -8 do not create opportunities for direct intercomparisons (of TOA products) for which near-simultaneous overpasses are necessary [69][70][71]. Inherently, the satellite pairs observe a target at 8-day intervals. It is thus imperative to choose sites with optically stable waters within this time window and assess the products (Level-2). We chose Crater Lake and Lake Tahoe representing oligotrophic and mesotrophic water bodies, respectively [69]. Here, the OLI data products (in operation since 2013 and in good agreement with MODIS [59]) are regarded as references for evaluating ETM+ products, which are then utilized to evaluate TM products within the 1999-2012 period. These sites, however, allow only for intercomparisons over a limited signal range in the blue, i.e., 0.002 < Rrs (485) < 0.008 [1/sr], and even smaller in the green and red channels. To extend that range, we also performed intercomparisons at AERONET-OC sites. Note that in both analyses, uncertainties in (a) the atmospheric correction and (b) OLI products are unavoidable.  Table 2. Note that the gains are derived after vicariously calibrating NIR and SWIR-I bands.

Gain Adjustments
Considering the uncertainties (i.e., instrument artifacts and environmental noise), there remain residual biases in both TM and ETM+ gains. Therefore, these gains only provide initial estimates for the biases and slight adjustments may be inevitable. In this section, we will further examine the robustness of these gains by evaluating TM-and ETM+-derived R rs products against those measured/derived from AERONET-OC and OLI-derived R rs products and fine-tune them as needed. Because of their orbit configurations, Landsat-5 and -7 and Landsat-7 and -8 do not create opportunities for direct intercomparisons (of TOA products) for which near-simultaneous overpasses are necessary [69][70][71]. Inherently, the satellite pairs observe a target at 8-day intervals. It is thus imperative to choose sites with optically stable waters within this time window and assess the R rs products (Level-2). We chose Crater Lake and Lake Tahoe representing oligotrophic and mesotrophic water bodies, respectively [69]. Here, the OLI data products (in operation since 2013 and in good agreement with MODIS [59]) are regarded as references for evaluating ETM+ products, which are then utilized to evaluate TM products within the 1999-2012 period. These sites, however, allow only for intercomparisons over a limited signal range in the blue, i.e., 0.002 < R rs (485) < 0.008 [1/sr], and even smaller in the green and red channels. To extend that range, we also performed intercomparisons at AERONET-OC sites. Note that in both analyses, uncertainties in (a) the atmospheric correction and (b) OLI products are unavoidable.
To begin, TM and ETM+ data over these sites (i.e., all available matchups at AERONET-OC stations, Crater and Tahoe lakes) were processed using the computed vicarious gains (Table 2). Initial analyses of the data revealed that the TM-and ETM+-derived R rs products are slightly biased low (i.e., <0.0016 [1/sr]) in the blue and red channels. The difference was found largest in the TM's blue band. As an example, Figure 5 illustrates the corresponding scatterplots for the OLI-ETM+ and TM-ETM+ intercomparisons in the blue bands. The root-mean-square differences (RMSD), the median relative differences (MRD), and median biases (Bias) for the OLI-ETM+ matchups can be computed to gauge the relative differences: The RMSD, Bias, and the slope for the linear fit are reported in Figure 5. After running a few experiments to minimize the differences in TM and ETM+ products with respects to those of AERONET-OC and OLI, final vicarious adjustments were determined. Table 3 contains these final gains. It should be noted that due to large uncertainties in the gain computations for the SWIR-I band (see Table 2), the largest adjustments were made in these bands to compensate for potential residual differences/biases in the visible bands [64]. The goodness of these final gains is independently tested at Level-2 products against those of Aqua-MODIS (MODISA) and Terra-MODIS (MODIST) over a handful of sites across North America (see Section 3.2.2).
Remote Sens. 2018, 10, x FOR PEER REVIEW 8 of 23 To begin, TM and ETM+ data over these sites (i.e., all available matchups at AERONET-OC stations, Crater and Tahoe lakes) were processed using the computed vicarious gains (Table 2). Initial analyses of the data revealed that the TM-and ETM+-derived products are slightly biased low (i.e., <0.0016 [1/sr]) in the blue and red channels. The difference was found largest in the TM's blue band. As an example, Figure 5 illustrates the corresponding scatterplots for the OLI-ETM+ and TM-ETM+ intercomparisons in the blue bands. The root-mean-square differences (RMSD), the median relative differences (MRD), and median biases (Bias) for the OLI-ETM+ matchups can be computed to gauge the relative differences: The RMSD, Bias, and the slope for the linear fit are reported in Figure 5. After running a few experiments to minimize the differences in TM and ETM+ products with respects to those of AERONET-OC and OLI, final vicarious adjustments were determined. Table 3 contains these final gains. It should be noted that due to large uncertainties in the gain computations for the SWIR-I band (see Table 2), the largest adjustments were made in these bands to compensate for potential residual differences/biases in the visible bands [64]. The goodness of these final gains is independently tested at Level-2 products against those of Aqua-MODIS (MODISA) and Terra-MODIS (MODIST) over a handful of sites across North America (see Section 3.2.2).  Table 2 were supplied to SeaDAS to generate these products. Final gains are tabulated in Table 3. Note that the image pairs were acquired 8 days apart.   Table 2 were supplied to SeaDAS to generate these products. Final gains are tabulated in Table 3. Note that the image pairs were acquired 8 days apart.

Results
After implementing the vicarious gains (Table 3), the generated R rs products are first evaluated for qualitative and quantitative rigor followed by analyzing the time-series products.

Qualitative Assessment of R rs Products
Two examples of products processed via SeaDAS are illustrated in Figures 6 and 7. The goal is to qualitatively demonstrate the TM and ETM+ products and visually compare them to those obtained from OLI. Note that these products are associated with different seasons/times; therefore, this analysis is not suitable to gauge consistency, which will be discussed in Sections 3.2.2 and 3.3. Note that a 7 × 7-element (uniform) averaging filter is applied in the aerosol removal process. Figure 6 illustrates the R rs products derived from TM (left), ETM+ (middle), and OLI (right) over the turbid and shallow northern San Francisco Bay area (i.e., San Pablo Bay). The natural variability (e.g., gradients in optical regimes) in water-column is clearly captured in TM and ETM+ images. Small eddies, local resuspensions, and upwelling events can visually be discerned in the TM-and ETM+-derived R rs products. Nevertheless, artifacts are pronounced in TM and ETM+ products when compared to those of OLI. The artifacts in TM products (in particular in the blue band) are due to memory effect, coherent noise, and striping [13]. Due to OLI's higher SNR and absence of major artifacts, the quality of OLI products is noticeably better than those of TM and ETM+. The scan-line corrector issue, i.e., SLC-off, [3] associated with ETM+ retrievals are masked, i.e., white stripes. Furthermore, it should be noted that, although TM and ETM+ measure spectral radiances in three broad bands, these bands capture relatively distinct spectral signatures of near-surface constituents and their optical properties.

Results
After implementing the vicarious gains (Table 3), the generated products are first evaluated for qualitative and quantitative rigor followed by analyzing the time-series products.

Qualitative Assessment of Rrs Products
Two examples of products processed via SeaDAS are illustrated in Figures 6 and 7. The goal is to qualitatively demonstrate the TM and ETM+ products and visually compare them to those obtained from OLI. Note that these products are associated with different seasons/times; therefore, this analysis is not suitable to gauge consistency, which will be discussed in Sections 3.2.2 and 3.3. Note that a 7 × 7-element (uniform) averaging filter is applied in the aerosol removal process. Figure  6 illustrates the products derived from TM (left), ETM+ (middle), and OLI (right) over the turbid and shallow northern San Francisco Bay area (i.e., San Pablo Bay). The natural variability (e.g., gradients in optical regimes) in water-column is clearly captured in TM and ETM+ images. Small eddies, local resuspensions, and upwelling events can visually be discerned in the TM-and ETM+derived products. Nevertheless, artifacts are pronounced in TM and ETM+ products when compared to those of OLI. The artifacts in TM products (in particular in the blue band) are due to memory effect, coherent noise, and striping [13]. Due to OLI's higher SNR and absence of major artifacts, the quality of OLI products is noticeably better than those of TM and ETM+. The scan-line corrector issue, i.e., SLC-off, [3] associated with ETM+ retrievals are masked, i.e., white stripes. Furthermore, it should be noted that, although TM and ETM+ measure spectral radiances in three broad bands, these bands capture relatively distinct spectral signatures of near-surface constituents and their optical properties.  Another demonstration of the natural variability in inland and nearshore coastal waters captured in R rs (569) and R rs (560) products derived from TM and ETM+ is illustrated in Figure 7. The images were acquired eight days apart in September 2006 over the eastern North Carolina inland waters and shorelines. The scan-line corrector issues (i.e., SLC-off) associated with ETM+ retrievals are masked (shown with white stripes). The TM and ETM+ clearly reveal various water types, such as relatively clear waters in the coastal shelf area, nearshore turbid waters, and CDOM-rich river waters denoted with arrows, i.e., Tar/Pamlico River. Resuspension events can also be inferred in shallower waters. These demonstrations suggest the remarkable value of archived Landsat data and their utility enabled via SeaDAS. The negative retrievals, i.e., failure of the atmospheric correction, in CDOM-rich river waters are a known problem requiring further study and investigations into the future.
Another demonstration of the natural variability in inland and nearshore coastal waters captured in Rrs(569) and Rrs(560) products derived from TM and ETM+ is illustrated in Figure 7. The images were acquired eight days apart in September 2006 over the eastern North Carolina inland waters and shorelines. The scan-line corrector issues (i.e., SLC-off) associated with ETM+ retrievals are masked (shown with white stripes). The TM and ETM+ clearly reveal various water types, such as relatively clear waters in the coastal shelf area, nearshore turbid waters, and CDOM-rich river waters denoted with arrows, i.e., Tar/Pamlico River. Resuspension events can also be inferred in shallower waters. These demonstrations suggest the remarkable value of archived Landsat data and their utility enabled via SeaDAS. The negative retrievals, i.e., failure of the atmospheric correction, in CDOM-rich river waters are a known problem requiring further study and investigations into the future.

Validation
For a quantitative assessment of heritage Landsat products, we utilized two data sources; (a) AERONET-OC data and (b) Rrs products obtained from MODIS. This was to provide as much evidence as possible to ensure that these products were, on average, in reasonable agreement with standard products.

AERONET-OC
Since the early 2000s, the AERONET-OC data (https://aeronet.gsfc.nasa.gov) have been used as standard datasets to evaluate the quality of ocean color data products. Although the sites do not represent global in-water conditions, they provide adequate and easily accessible matchups to evaluate the performance of both atmospheric correction and instrument's radiometric characteristics. The scatterplots for ETM+ and TM data for all the bands are shown in Figure 8. For the matchups, hazy images and areas affected by partial sunglint were removed from the analyses. A total of 216 and 79 matchups were found for ETM+ and TM data, respectively. For TM and ETM+, >90% and >65% of the matchups correspond to moderately turbid waters of the Venise (located in Northern Adriatic Sea) and Martha's Vineyard (situated in southern coastal waters of Massachusetts) sites. The statistics were generated in a similar fashion as in Equations (3)-(5).

Validation
For a quantitative assessment of heritage Landsat products, we utilized two data sources; (a) AERONET-OC data and (b) R rs products obtained from MODIS. This was to provide as much evidence as possible to ensure that these products were, on average, in reasonable agreement with standard products.

AERONET-OC
Since the early 2000s, the AERONET-OC data (https://aeronet.gsfc.nasa.gov) have been used as standard datasets to evaluate the quality of ocean color data products. Although the sites do not represent global in-water conditions, they provide adequate and easily accessible matchups to evaluate the performance of both atmospheric correction and instrument's radiometric characteristics. The scatterplots for ETM+ and TM data for all the bands are shown in Figure 8. For the matchups, hazy images and areas affected by partial sunglint were removed from the analyses. A total of 216 and 79 matchups were found for ETM+ and TM data, respectively. For TM and ETM+, >90% and >65% of the matchups correspond to moderately turbid waters of the Venise (located in Northern Adriatic Sea) and Martha's Vineyard (situated in southern coastal waters of Massachusetts) sites. The statistics were generated in a similar fashion as in Equations (3)  The computed statistics after implementing gains (Table 3) are provided in Table 4. For reference, MRD and root-mean-square error (RMSE) computed with unity gains, i.e., prior to implementing the gains, are given in round brackets. Note that RMSE was computed as in Eq. 3; yet, here, we treat the AERONET-OC data as reference standards. Comparing MRD and RMSE statistics indicate that the vicarious gains, on average, have significantly reduced median biases in products (in particular for TM); thereby enhancing the overall consistency in long-term Landsat-derived products. The slope and intercepts for TM matchups allude to potential non-linearity at low radiance levels. Collectively, the statistics suggest that TM products should be handled with care when accurate/precise products (e.g., Chl) are desired. The best performances are, in general, found in the green channels. The relatively large slopes in ETM+ analyses implicate potential nonlinearity in instrument responses. We further computed the median spectra, i.e., , from AERONET-OC data to estimate percent RMSE ( / ) in retrievals. The for ETM+ and TM data were 0.0049, 0.00519, 0.00146 [1/sr] and 0.0050, 0.00371, 0.0013 [1/sr], respectively. The percent RMSEs were estimated as 36%, 28%, 62%, and 46%, 35%, 84% for ETM+ and TM's blue, green, and red bands, respectively. Although the quality of ETM+ are adequately evaluated here using AERONET-OC data, there is still a need to better assess TM products over a broader range of water bodies. Table 4. The statistics associated with matchup analyses of Landsat-7 and -5 at AERONET-OC sites. The values in round brackets refer to statistics derived prior to implementing the gains listed in Table  3. The large RMSEs and deviations of slopes from unity are associated with random and systematic errors in TM and ETM+ observations. The Landsat-5 matchups are available only for the last eight years of the mission. The computed statistics after implementing gains (Table 3) are provided in Table 4. For reference, MRD and root-mean-square error (RMSE) computed with unity gains, i.e., prior to implementing the gains, are given in round brackets. Note that RMSE was computed as in Equation (3); yet, here, we treat the AERONET-OC data as reference standards. Comparing MRD and RMSE statistics indicate that the vicarious gains, on average, have significantly reduced median biases in products (in particular for TM); thereby enhancing the overall consistency in long-term Landsat-derived products. The slope and intercepts for TM matchups allude to potential non-linearity at low radiance levels. Collectively, the statistics suggest that TM products should be handled with care when accurate/precise products (e.g., Chl) are desired. The best performances are, in general, found in the green channels. The relatively large slopes in ETM+ analyses implicate potential nonlinearity in instrument responses. We further computed the median R rs spectra, i.e., R rs , from AERONET-OC data to estimate percent RMSE (RMSE/R rs ) in retrievals. The R rs for ETM+ and TM data were 0.0049, 0.00519, 0.00146 [1/sr] and 0.0050, 0.00371, 0.0013 [1/sr], respectively. The percent RMSEs were estimated as 36%, 28%, 62%, and 46%, 35%, 84% for ETM+ and TM's blue, green, and red bands, respectively. Although the quality of ETM+ are adequately evaluated here using AERONET-OC data, there is still a need to better assess TM products over a broader range of water bodies. Table 4. The statistics associated with matchup analyses of Landsat-7 and -5 at AERONET-OC sites. The values in round brackets refer to statistics derived prior to implementing the gains listed in Table 3. The large RMSEs and deviations of slopes from unity are associated with random and systematic errors in TM and ETM+ observations. The Landsat-5 matchups are available only for the last eight years of the mission.

MODIS-Derived R rs Products
Since the AERONET-OC data do not provide a full representation of aquatic conditions in inland and nearshore coastal waters, we further evaluated TM/ETM+ products against (same-day) MODISA and MODIST products over the central basin of Lake Mead (n = 100/101), San Francisco (SF) Bay (n = 63/75), San Pablo Bay (n = 66/69), and the eastern basin of Lake Erie (n = 19/29) in North America for which a rich archive of TM data is available (limitations in the data downlink during periods of TM operations preclude a robust global analysis). The number of matchups (in round brackets) refer to the number of MODIST-ETM+ and MODISA-TM matchups, respectively. These sites collectively enable a fair assessment of TM and ETM+ products over a wider range of aquatic conditions (e.g., areas with high CDOM and/or sediment loads) and are sufficiently spatially extensive for valid ocean color products from MODIS observations. Landsat-5, Landsat-7, and Terra were/are all morning satellites, while Aqua is in an afternoon orbit. Our initial assessments indicated noticeable uncertainties in intercomparisons due to the large time differences between ETM+ and MODISA overpasses. This was, in particular, observed over SF Bay influenced by significant tidal forcing. Therefore, for cross-validating ETM+ products, MODIST products were applied. In addition, because the two missions are in very similar orbits, near-nadir MODIST observations/products are available for cross-validating ETM+ products. On the other hand, Landsat-5 and Terra's orbits are different such that only MODIST observations at the edge of the swaths are available for valid, same-day intercomparisons with those of TM. Such products are represented with large footprint sizes; therefore, larger uncertainties in products are expected. It was then decided to utilize MODISA products for investigating TM products, recognizing that there is two-to-three-hour time gap between data collections, which increase uncertainties in intercomparisons. The Level-1B MODIS data were processed to Level-2 products using a NIR-SWIR band combination [73]. The rest of the parameter-setting was identical to the current NASA Ocean Biology Processing Group's standard processing. The spatial sampling for Aqua-MODIS R rs products were conducted by taking the median values within 3×3-element windows to avoid contaminations from land. The equivalent sample area for Landsat products (i.e., median within 3 × 3 km regions) were adjusted according to MODIS viewing geometries. Further, the differences in the spectral bands were accounted for [61]. Figure 9 shows the scatterplots with the statistical analyses tabulated in Table 5.
Overall, there is a reasonable correlation in MODIST-ETM+ intercomparisons. The scattered matchups correspond to artifacts in TM and ETM+ products, uncertainties in the atmospheric correction, the spectral band adjustments, and possible changes in water-column conditions (in particular for TM-MODISA intercomparisons over SF Bay and San Pablo Bay influenced by strong tides). The uncertainties due to the time difference in overpasses were also found over Lake Mead and Lake Erie when both MODIST-ETM+ and MODISA-ETM+ intercomparisons were examined. Further site-specific analyses suggested larger uncertainties in TM-derived blue and red bands in Lake Mead and Lake Erie where R rs (485) and R rs (660) were, on average, 0.0034 [1/sr] and 0.0022 [1/sr], respectively.

Time-Series Applications
In order to examine the long-term record of Landsat-derived products, we analyzed seasonal composites generated for 12 aquatic systems across the continental USA. The seasonal composite plots were derived from locally processed Landsat data over 12 sites, three of which are presented in this manuscript. The median computed in each season represents seasonal composite products, which were produced to minimize noise in the analyses. The three plots shown in Figures 10-12 (Table 1). Figure 10 shows these products for the IRL in the vicinity of Sebastian Inlet State Park (FL).
Our approach developed for compensating for non-zero Lw(NIR) (Section 2.1) significantly increased the number of valid retrievals, i.e., more than 40% of the retrievals were negative in Rrs(485) products. Since data were collected eight days apart, potentially yielding biases in seasonal composites, it is difficult to infer consistency in the data products. In addition, seawater-intrusion and tidal exchanges may complicate such analyses. Seasonal composites over Pamlico Sound, North Carolina, are illustrated in Figure 11. Some of the retrievals may be associated with the noise in the NIR and SWIR-I bands that propagate through the atmospheric correction [64]. Note that prior to accounting for Lw(NIR) the entire record in the blue band was invalid. The time-series plot for the San Joaquin River (California) is shown in Figure 12 For this riverine system dominated with scattering particles, Lw(NIR) should be accounted for to allow for valid and/or accurate retrievals [54]. Similar results were found for a downstream system, i.e., Suisan Bay. The impact of the proposed methodology was most noticeable in such CDOM-dominated waters, i.e., frequent negative retrievals in the blue and green were observed prior to its implementation.

Time-Series Applications
In order to examine the long-term record of Landsat-derived R rs products, we analyzed seasonal composites generated for 12 aquatic systems across the continental USA. The seasonal composite plots were derived from locally processed Landsat data over 12 sites, three of which are presented in this manuscript. The median R rs computed in each season represents seasonal composite products, which were produced to minimize noise in the analyses. The three plots shown in Figures  , Onondaga Lake (NY), Lake Tahoe (CA), and Crater Lake. The only products evaluated were R rs for the three visible bands (Table 1). Figure 10 shows these products for the IRL in the vicinity of Sebastian Inlet State Park (FL).
Our approach developed for compensating for non-zero L w (NIR) (Section 2.1) significantly increased the number of valid retrievals, i.e., more than 40% of the retrievals were negative in R rs (485) products. Since data were collected eight days apart, potentially yielding biases in seasonal composites, it is difficult to infer consistency in the data products. In addition, seawater-intrusion and tidal exchanges may complicate such analyses. Seasonal R rs composites over Pamlico Sound, North Carolina, are illustrated in Figure 11. Some of the retrievals may be associated with the noise in the NIR and SWIR-I bands that propagate through the atmospheric correction [64]. Note that prior to accounting for L w (NIR) the entire record in the blue band was invalid. The time-series plot for the San Joaquin River (California) is shown in Figure 12 For this riverine system dominated with scattering particles, L w (NIR) should be accounted for to allow for valid and/or accurate retrievals [54]. Similar results were found for a downstream system, i.e., Suisan Bay. The impact of the proposed methodology was most noticeable in such CDOM-dominated waters, i.e., frequent negative retrievals in the blue and green were observed prior to its implementation.  The composites were obtained by taking the median of retrievals in each season. The images processed via SeaDAS. The product consistency can be inferred in the overlapping periods indicating the utility of long-term Landsat data record for aquatic science studies. Note that because of the differences in the temporal sampling (i.e., eight days) and considering various tidal stages, a robust analysis of product consistency is not possible.  The composites were obtained by taking the median of retrievals in each season. The images processed via SeaDAS. The product consistency can be inferred in the overlapping periods indicating the utility of long-term Landsat data record for aquatic science studies. Note that because of the differences in the temporal sampling (i.e., eight days) and considering various tidal stages, a robust analysis of product consistency is not possible.

Discussions: Product Quality
The analyses thus far have highlighted some of the potentials and limitations of Landsat archive for long-term aquatic science studies. Now, with a rigorous, physics-based processing system in place, the science community can further explore the utility of the Landsat data record in science studies, such as examining impacts of human-induced activities (e.g., landuse/landcover change) on water quality. We demonstrated that following the cross-calibration and vicarious calibration of TM and ETM+ data, the biases in Rrs products are significantly reduced. In this section, we will further elaborate on the limitations of TM and ETM+ Rrs products with regard to noise contributions and processing flaws. A brief discussion on adjacency effects is also provided.

Signal-to-Noise Ratio (SNR)
Although we showed that TM-and ETM+-derived Rrs products agree well, on average, with those of OLI [59], OLI's SNR is three-to-four times higher than those of TM and ETM+ [5,12]. To better understand the implications of these differences in Rrs products, a brief sensitivity analysis is carried out. Here, a small subset (10 × 10) of a cloud-free OLI image (ID:LC80220402014227LGN00, solar zenith angle of 26.3°) with low aerosol load, i.e., AOT(865) = 0.08, and over moderately turbid waters (Table 6) was processed by introducing various noise levels to simulate TM, ETM+, and OLI performances [5,12]. For each sensor, noise ( ) is estimated via ⁄ . The median and (derived from OLI subset) are tabulated in Table 6. The SNRs are adopted from References [5,74].
Note that OLI's SNR from [74] were scaled to match the SNRs derived from typical radiances reported in Reference [5] for TM, ETM+, and MODIS. The analyses were conducted via Monte Carlo simulations for all the bands (Table 1) except the SWIR-II band, which was not used in the aerosol removal. To provide a reference, the OLI subset was also processed with the MODIS noise performance [5] representing a high-fidelity ocean color sensor. To generate various scenarios, were allowed to vary within the range with having a normal distribution. All the permutations resulted in ~7000 SeaDAS runs, which were repeated four times for TM, ETM+, OLI, and MODIS. The product consistency can be inferred in the cross-mission overlapping periods. Few negative retrievals are attributed to imperfect corrections for the contributing water-leaving radiance in the NIR channel.

Discussions: Product Quality
The analyses thus far have highlighted some of the potentials and limitations of Landsat archive for long-term aquatic science studies. Now, with a rigorous, physics-based processing system in place, the science community can further explore the utility of the Landsat data record in science studies, such as examining impacts of human-induced activities (e.g., landuse/landcover change) on water quality. We demonstrated that following the cross-calibration and vicarious calibration of TM and ETM+ data, the biases in R rs products are significantly reduced. In this section, we will further elaborate on the limitations of TM and ETM+ R rs products with regard to noise contributions and processing flaws. A brief discussion on adjacency effects is also provided.

Signal-to-Noise Ratio (SNR)
Although we showed that TM-and ETM+-derived R rs products agree well, on average, with those of OLI [59], OLI's SNR is three-to-four times higher than those of TM and ETM+ [5,12]. To better understand the implications of these differences in R rs products, a brief sensitivity analysis is carried out. Here, a small subset (10 × 10) of a cloud-free OLI image (ID:LC80220402014227LGN00, solar zenith angle of 26.3 • ) with low aerosol load, i.e., AOT(865) = 0.08, and over moderately turbid waters (Table 6) was processed by introducing various noise levels to simulate TM, ETM+, and OLI performances [5,12]. For each sensor, noise (n s ) is estimated via L t /SNR. The median L t and R rs (derived from OLI subset) are tabulated in Table 6. The SNRs are adopted from References [5,74].
Note that OLI's SNR from [74] were scaled to match the SNRs derived from typical radiances reported in Reference [5] for TM, ETM+, and MODIS. The analyses were conducted via Monte Carlo simulations for all the bands (Table 1) except the SWIR-II band, which was not used in the aerosol removal. To provide a reference, the OLI subset was also processed with the MODIS noise performance [5] representing a high-fidelity ocean color sensor. To generate various scenarios, L t were allowed to vary within the L t ± n s range with n s having a normal distribution. All the permutations resulted in~7000 SeaDAS runs, which were repeated four times for TM, ETM+, OLI, and MODIS.
The uncertainties (σ R rs ) provided in Table 4 represent one standard deviation derived from the resultant R rs distributions. The derived uncertainties provide guidance on expected random variability in heritage Landsat products compared to those derived from OLI and MODIS. In general, the uncertainties in TM are two-to-five times higher than those of OLI. Given the magnitude of the R rs spectrum, uncertainties in TM-, ETM+-, OLI-, and MODIS-derived R rs products are, on average,~25%, 27%,~8%, and~2%, in the blue and green bands, respectively. This implies that subtle changes in R rs , i.e., < σ R rs , leading to small changes in water quality parameters, may not be detected. In inland and nearshore coastal waters (e.g., bays, estuaries) where concentrations of TSS and Chl are often much higher than those in coastal ocean waters, highly precise concentrations (at levels expected from ocean color missions) may not be required. Therefore, end-users may utilize heritage R rs products with the "fit-for-purpose" concept in mind, i.e., products could be valuable for one application but may not be viable for another.
In addition, the choice of algorithm may alter how the noise is propagated to end products. For example, band-ratio algorithms [75,76] amplify the uncertainties. It is, thus, critical that users apply smoothing techniques, i.e., spatial filtering, (available to users by default via SeaDAS) to reduce the uncertainties. Assuming a 7 × 7-element (uniform) averaging window, σ R rs in Table 6 are reduced by a factor of 1/7. This analysis was carried out for TOA radiance. In practice, n s increases with increase in L t leading to variable uncertainties across a scene. Note, however, that the median L t considered in this analysis is higher than typical radiances encountered in Landsat scenes [74]. Hence, the computed values in Table 6 provide an upper bound for (random component of) uncertainties.
It should also be noted that although ETM+'s SNRs in the visible bands are higher than those of TM, uncertainties in TM-derived R rs are slightly smaller. This is attributed to TM's higher SNR in the NIR band employed in the aerosol removal [64].

Image Artifacts and Processing Flaws
Heritage Landsat data, in particular TM images, often contain artifacts (e.g., striping, coherent noise, memory effects) suggesting that R rs products should always be interpreted with care. Some of these artifacts are evident in Figure 6. Further, we found anomalies in TM products within the 1990-94 interval. This is highlighted in Figure 13 where time-series of R rs products over fairly optically stable inland waters [59], i.e., Crater Lake and Lake Tahoe, are illustrated. Further analyses of L t and other intermediate products (e.g., L r ) indicated unusually high L t in the red, NIR, and SWIR-I bands. These high radiometric responses yield invalid R rs products [64]. No such anomalies were identified for ETM+ time-series, i.e., fewer artifacts present in ETM+ data. Nonetheless, occasional noisy retrievals were observed in time-series plots. Another observation was nonlinearity in response when adjusting vicarious gains in Section 2.3. This was, in particular, confirmed in the ETM+ matchup analysis (Table 5). Overall, the products may be used with caution in oligotrophic (e.g., Crater Lake), mesotrophic (e.g., Lake Tahoe), and CDOM-rich waters. Table 6. The SNR, TOA radiance (L t ), and the uncertainties in R rs (σ R rs ) associated with TM, ETM+, OLI, and MODIS. σ R rs represents one standard deviation corresponding to the R rs distributions derived from many SeaDAS runs for a small OLI subset. Note that the aerosol removal was conducted using the 840-1676 band combination and OLI's SNR from [74] were scaled to provide comparable SNRs estimated for typical radiances reported in Reference [5] for TM, ETM+, and MODIS. The units for L t and σ R rs are wm −2 µ −1 sr −1 and sr −1 , respectively. Band centers refer to TM channels. To infer uncertainties in products, we quadratically added up systematic and random errors. It is, in general, difficult to fully characterize systematic errors for heritage sensors/products. However, according to the matchup analyses (Table 4), intercomparisons with MODIS (Table 5), and SNR assessments (Table 6), it is possible to speculate a maximum uncertainty estimate of 0.0019, 0.0018, and 0.0012 [1/sr] in the blue, green, and red channels of TM and ETM+ products, respectively. These estimates are calculated by taking the average of RMSEs in Tables 4 and 5. Also, note that these maximum uncertainties are estimated recognizing (a) implementation of smoothing filters for reducing random noise contribution and (b) uncertainties in the matchup/intercomparison analyses. The changes and/or anomalies larger than the above threshold are expected to be detected in the heritage products. According to the matchup analyses (Section 3.2), the quality of the TM-derived R rs (660) were found to be reliable for relatively highly backscattering waters where R rs (660) > 0.004 [1/sr]. Our processing system may also contribute to uncertainties. For instance, it is likely that the retrievals over Crater and Tahoe lakes are affected by imperfect corrections of water-vapor transmission. Bear in mind that TM and ETM+'s NIR channels (Figure 1), i.e.,~140 nm wide, contain a water absorption feature centered around~820 nm, whereas OLI's NIR band is relatively narrow (Figure 1). Furthermore, some of the existing noise in the analyses may be attributed to failure in cloud-screening/-masking conducted using thresholds adopted from ocean color data processing. There are also inaccurate retrievals in periods where CDOM loads are high leading to the underestimation of R rs . More research must be dedicated to these topics to enhance these historic R rs products.

Adjacency Effects
The adjacency effect (AE) is another prime topic when using Landsat data in inland waters. Such signal complicating R rs retrievals were pronounced over Onondaga Lake (NY). To qualitatively evaluate presence/absence of adjacency effects, the Rayleigh-corrected (L rc ) data over three sites, including Onondaga Lake (NY), Chowan River (NC), and Boston Harbor (MA) were analyzed. The exact locations for these sites are (43.0919N, 76.2145W), (36.1763N, 76.7378W), and (42.3401, 70.9824W), respectively. To do so, below index (AE_Ind) was computed AE_Ind = L rc (NIR) − L rc (Red) L rc (NIR) + L rc (Red) (6) Figure 13. The time-series of R rs products in the blue bands over optically stable waters of Crater and Tahoe lakes. The overall signal level is very similar in time. A potential anomaly found for TM products is highlighted in the Crater Lake time-series.
Although AE_Ind (ranging from −1 to 1) includes aerosol contributions, it provides insights into whether the L rc spectrum deviates from expected trends, i.e., a monotonically decreasing one towards longer wavelengths. Therefore, the higher the AE_Ind, the stronger the NIR signal is, potentially alluding to the presence of adjacency effects. The time-series plots of AE_Ind extracted from Boston Harbor, Chowan River, and Onondaga Lake are illustrated in Figure 14. From the analyses of the R rs data products (not shown here) over the Boston Harbor and the corresponding time-series shown in Figure 14, it appears that there is negligible adjacency effect. Considering the plots associated with Boston Harbor as the reference, it is found that there is AE in some instances of Chowan River retrievals. The impact of AE is much more pronounced over Onondaga Lake where AE_Ind is frequently very close to zero and even positive for some instances. Since the topography around Onondaga Lake and Chowan River is fairly flat and the two systems are comparable in their morphology, i.e., <2 km wide, it can be concluded that environmental conditions (e.g., wind, humidity, turbulence at land-atmosphere interface) play a major role in the magnitude of adjacency signal. Note further that the increased in the 1992-1996 period (marked with ovals) verifies a potential issue in TM products (as in Figure 13), which requires further investigation. Since products are not corrected for aerosol contributions, irregularities in Landsat-5 orbit may explain such anomalous trends in time-series analyses that do not accounted for variations in solar angles/radiation [77].
While with only six spectral measurements of TM and ETM+, it is nearly impossible to remove adjacency effects, it is certainly possible to minimize such effects with prior knowledge of the spectral characteristics of surrounding land targets, dominant environmental conditions and topography. Therefore, it is cautioned that the adjacency signals may not be assumed for all inland water sites (simply because they are small or narrow water bodies). For instance, land signals were not identified over San Joaquin River (CA), which is narrower than Onondaga Lake or Chowan River. Extensive research is required to identify water bodies susceptible to these effects, understand the dominant environmental conditions, and minimize them for the past and existing multispectral missions. With future hyperspectral missions planned/proposed for the next decade, it may be possible to more accurately account for these unwanted signals. Figure 14. The AE_Ind computed for three sites to analyze presence/absence of adjacency effects (AE). The higher the index, the larger the magnitude of AE is. The Onondaga Lake has the highest and most frequent presence of AE whereas the effect is deemed to be negligible or absent over Boston Harbor. Note that the NIR bands of TM and ETM+ are much wider than that of OLI. This has likely resulted in smaller OLI-derived AE_Ind. The higher-than-normal AE_Ind highlighted with ovals may be due to abnormalities in Landsat-5 orbit in early/mid 90's [77].

Conclusions
This work described the potentials and limitations of heritage Landsat missions for use in aquatic studies. Here, we demonstrated that the Landsat-5 and Landsat-7 data records, when corrected for atmospheric effects and vicariously calibrated via the algorithms and software available in the SeaWiFS Data Analysis System (SeaDAS), yield remote-sensing reflectance (R rs ) products that, on average, agree with those derived from in situ measurements and/or other ocean color products. We further proposed a slightly different strategy to account for L w (NIR) for TM and ETM+, which were/are missing the 443 nm channel. With careful regional or global algorithm developments, the processing system has enabled the creation of a long-term record of aquatic science products since 1984. Nevertheless, larger uncertainties in TM and ETM+ due to random and systematic noise, along with the availability of only three visible spectral bands, limit the choice of algorithms. Hence, algorithm developments for estimating turbidity and concentrations of TSS may be conducted with care. Following our analyses of several water bodies in North America, the TM and ETM+ data products are recommended to be developed and used in moderately to highly turbid waters (i.e., R rs (660) > 0.004 [1/sr]), where instrument artifacts are less noticeable. Overall, although the uncertainties in TM and ETM+ products are higher than those of OLI, they are still valuable assets for retrospective studies. This may include assessing changes in water quality of lakes, reservoirs, water supplies, and coastal receiving waters due to short-/long-term landuse/landcover changes as well as examining climate variability (i.e., weather patterns) and its impacts on water quality (e.g., location, frequency, and persistence in historic harmful algal blooms). There is also a need for the community to measure and quantify optical properties of water constituents in these poorly known aquatic systems. A more extensive database of in-water optical properties will enable developing regional or global long-term aquatic science products. Future research should focus on (a) developing algorithms that are less prone to random noise, (b) enhancing atmospheric correction to account for non-negligible near-infrared reflectance in extremely turbid or CDOM-rich waters, and (c) characterizing adjacency effects under various environmental/climatic conditions.