A Framework of Generating Land Surface Reﬂectance of China Early Landsat MSS Images by Visibility Data and Its Evaluation

: The Landsat time-series dataset is one of proposed framework.


Introduction
As one of the most successful remote sensing datasets, the Landsat time-series dataset is widely used in land surface research, thanks to its long time-series and products [1]. Using Landsat Ecosystem Disturbance Adaptive Processing System (LEDAPS) for TM/ETM+ of Landsat 4, 5, and 7 and Land Surface Reflectance Code (LaSRC) for OLI of Landsat 8, a global on-demand atmospherically corrected surface reflectance product capability is reached currently [2][3][4]. However, the Landsat 1-5 Multispectral Scanner System (MSS) images acquired from July 1972 are not as easy to use as the later Landsat data due to the challenges of radiometric calibration [5,6], georegistration [7,8], and atmospheric correction [9,10]. Since surface reflectance is always considered a necessary and minimum In contrast, the TM sensor onboard Landsat 4-5 presents seven bands at 30 m resolution and 8-bit quantization, with different bandpass positions.The main differences between MSS and TM images of Landsat Collection 1 Archive are shown in Table 1. The relative spectral responses (RSRs) of MSS and TM are downloaded from the USGS website (https: //landsat.usgs.gov/spectral-characteristics-viewer, accessed on 3 April 2022). Figure 1 shows that there are noticeable differences between the relative spectral response of MSS and TM, but the MSS sensors of Landsat 1-5 are slightly varied.  In 2016, the USGS reorganized the Landsat archive into a tiered collection management structure titled Landsat Collection 1 [1]. The Landsat 1-5 MSS data is resampled to 60 m resolution and 8-bit quantization and processed into Collection 1 through geometric registration [7] and radiometric calibration [6]. The geometric registration and radiometric calibration of MSS images are more difficult than later Landsat 4-5 TM images, Landsat 7 ETM+ images, and Landsat 8 OLI images. Only 0.61% of the MSS images can be processed to Tier 1, with tolerances of ≤12 m radial root mean square error (RMSE), while the percentages of TM, ETM+, and OLI are 59.63%, 70.04%, and 57.65% [34]. The absolute radiometric uncertainties of Landsat 1-5 MSS are 6-18%, while TM, ETM+, OLI are, respectively, 5-7%, 4%, and 3% [34].
The Landsat 4-5 satellites carried both MSS and TM sensors, which may simultaneously observe the Earth. The pair of MSS and TM images have the same observation time, target, and atmosphere condition, providing us cross-validation probability. In this research, we choose cloud-free images of Landsat 1-5 MSS Collection 1 in China during 1982-1995, with various land cover types, and we specially selected 479 pairs of MSS and TM images as cross-validation.

Landsat 4-5 LEDAPS Surface Reflectance Product
Landsat 4-5 LEDAPS Surface Reflectance Product is produced by the USGS ondemand processing system EROS Science Processing Architecture using LEDAPS. LEDAPS adopts additional water vapor, air pressure, and air temperature data from the National Centers for Environment Prediction (NCEP), ozone from the NASA Earth Probe Total Ozone Mapping Spectrometer, and topography. LEDAPS uses the DDV method for AOD retrieval without auxiliary aerosols data [35].
The Landsat LEDAPS SR product is evaluated completely and thoroughly with good precision [2,3,36]. Using AERONET-AOD as true value, TM bands' average uncertainty (U) varies from 0.0041 (red band) to 0.0070 (SWIR-1 band) in reflectance, while TM bands' average uncertainty (U) varies from 0.009 (red band) to 0.017 (SWIR-1 band) in reflectance, using MODIS surface reflectance (after VJB Bidirectional Reflectance Distribution Function adjusted) as true value. As the MSS LSR data generated by our proposed framework are much more likely to be used with the LEDAPS product to form a long time series, we choose the LEDAPS SR product of Landsat TM as the truth value.

Integrated Surface Database
The Integrated Surface Database (ISD) consists of global hourly meteorological observations [37]. The earliest records of ISD can be traced back to the year 1901. Figure 2 shows the stations of ISD in 1973 in or near China.   ISD observes numerous phenomena, e.g., wind, temperature, pressure, visibility, and precipitation. The quality control algorithms integrated the data from over 100 original data sources and compiled them into ASCII format. In this research, we extract visibility records according to the image observation time and center coordinates by a spatial and temporal search radius.

NCEP Reanalysis
The NCEP/NCAR Reanalysis 1 is a widely-used dataset based on data assimilation [38]. The data are generated as 2.5 × 2.5 degrees global grids from 1948, covering air temperature, relative humidity, and specific humidity. This research uses daily specific humidity to calculate the column water vapor and surface air temperature and surface relative humidity to calculate surface water vapor pressure.

China Land Use/Cover Change (CNLUCC)
As the aerosol type is highly associated with land cover, we use China's Multi-Period Land Use Land Cover Remote Sensing Monitoring Dataset (CNLUCC) to determine the aerosol type of the MSS and TM images [39]. The CNLUCC dataset covers seven periods, including the late 1970s, late 1980s, mid-1990s, late 1990s, 2005, 2010, and 2015, and classifies the land cover into six classes and 25 subclasses by visual interpretation [39][40][41].

Overview
The framework of generating the LSR of China's early MSS images by visibility data is shown in Figure 3. Similar to the later sensors, the framework includes the radiometric calibration to convert DN values to top-of-atmosphere (TOA) reflectance or radiance. We used the radiometric calibration coefficients of the metadata file (MTL.txt) provided by the Landsat Collection 1 Level 1 Archive, which was updated in 2019 [6]. Moreover, the observation date and time, geographic coordinates of the image center, and solar azimuth/elevation angle are all extracted from the metadata file.
The atmospheric RT model computed the transmission, intrinsic reflectance, and spherical albedo terms. Then, atmospheric correction coefficients are provided to transfer TOA radiance or reflectance to surface reflectance for every band. Since the visibility records are independent of the remote sensing observation, the quality assessment (QA) band is unnecessary, which helps extract the dense dark vegetation pixels for the DDV method.
The key of the framework is the module of retrieval and calculation of atmosphere parameters. The 6S model allows a direct visibility input, as an empirical model is built in the code. For contrast, we also applied a Qiu model [29], a physically-based model with an empirical correction function for AOD retrieval. The input parameters of empirical and physically-based models are different.
The visibility records are retrieved by search radiuses of 2 degrees spatially and 2 h temporally. All the visibility records observed two hours before and after the acquisition time within the 2 degrees area are retrieved. If the number of records retrieved is less than 4, the search radiuses increase with the maximum of 4 degrees spatially and 3 h temporally. The maximum visibility record of all the records is used, for the lower records are vulnerable to local environmental conditions. For those images with no visibility records retrieved, default visibility of 23 km is used. The content of WV is derived from NCEP/NCAR Reanalysis 1, and the ozone concentration is set as 0.345 atm-cm, as the Total Ozone Mapping Spectrometer (TOMS) data was not available during the 1970s. The aerosols model is precalculated by the land cover of China in 1980 and classified into four models, namely, continental, maritime, urban, and desert.

6SV Atmospheric Correction Model
The 6SV radiative transfer model expresses the atmospheric correction equation in the Lambertian condition with no adjacency effects [4,20], as follows: where ρ TOA is the top-of-atmosphere (TOA) reflectance, ρ atm is the atmosphere intrinsic reflectance, ρ s is the land surface reflectance. Tr atm is the total atmosphere transmission effected by AOD and water vapor (WV), S atm is the atmosphere spherical albedo, and Tg is the gaseous transmission of absorbing gases in the atmosphere, including water vapor (Tg wv ), ozone (Tg O 3 ), and other gases (Tg og ).
θ v , θ s , and φ are the view zenith angle, solar zenith angle, and relative azimuth angle, m equals 1/cos(θ s ) + 1/cos(θ v ), P is the pressure that influences the number of molecules and the concentration, U is the integrated content, U H 2 O and U O 3 for water vapor and ozone, respectively, and Aer is the parameters of aerosol.  In the 6S model, an empirical model was proposed to describe the relationship between AOD and visibility. Two profiles of 23 km visibility and 5 km visibility were derived [42], and the AOD at 550 nm for visibility 5 km and 23 km were, respectively, calculated with the parameters defined [43][44][45]. For another visibility, a new particle density profile was computed by the profiles defined by the visibility of 5 km and 23 km.

AOD Retrieval Model Based on Visibility
In addition to the empirical models, several physically-based models were proposed on the concept of visibility proposed by Koschmieder [26]. Combined with the Elterman model describing the relationship between visibility and extinction coefficient of aerosols [27], a corrected scheme for China is proposed [29], as follows: where τ a is the total optical depth of atmospheric molecules and aerosol particles, V is the horizontal visibility, , where P w is the water pressure of the visibility measurement location in hPa. The surface water pressure of stations is also estimated using NCEP/NCAR Reanalysis 1.
Though researchers proposed other advanced models to retrieve AOD from visibility, Qiu's model is more convenient and is easier to calculate, as all the input parameters can be calculated based on the NCEP/NCAR Reanalysis 1 and ISD. Therefore, we applied Qiu's AOD retrieval model in this research.

Evaluation and Uncertainty Analysis
Since rare simultaneous in situ LSR observation records were acquired in China about 30-50 years ago, we chose an indirect method to evaluate the MSS land surface reflectance product. Thanks to the simultaneous observation of MSS and TM in Landsat 4-5, the TM LEDAPS product was used as the truth value. Uncertainties are brought by several sources in the cross-validation, shown in Table 2 in detail when comparing the measured value with the truth value. • means that the factor influences the validation results. × means that the factor does not influence the validation results.

Comparison Method
Three statistical metrics are used to evaluate the Landsat surface reflectance product, namely, accuracy, precision, and uncertainty (APU), as follows: where the ε i = ρ i − ρ i,re f erence is the absolute error, and ρ i,re f erence is the LEDAPS surface reflectance, as the truth value. The global surface reflectance products of Landsat TM/ETM+ and OLI generated by LEDAPS and LaSRC are evaluated by these metrics [3,4]. Using these metrics can make the evaluation of the generated Landsat MSS LSR data consistent and comparable with the widely-used Landsat LSR products.

Uncertainty Brought by RSR
The differences between RSRs of TM and MSS may result in the least uncertainty when comparing the MSS LSR and TM LSR. The effective reflectance over MSS and TM spectral range can be calculated as follows: where ρ H is the hyperspectral reflectance measured or simulated, while ρ B is the broadband reflectance, and the RSR is the relative spectral response of a specific band. By the broadband reflectance synthesized by hyperspectral reflectance, only the difference between RSRs is considered. Considering the land cover type, we assign the ECOSTRESS spectral library (v1.0) and PROSAIL simulated profiles to ρ H . The ECOSTRESS spectral library contains over 3000 spectra, including minerals, rocks, artificial materials, and vegetation, derived from the Advanced Spaceborne Thermal Emission Reflection Radiometer (ASTER) spectral library [46]. A part of the ECOSTRESS spectra provides records in the visible, near-infrared, and shortwave infrared bands, ranging from 0.35 µm to 2.4 µm. However, the ECOSTRESS spectral library only contains leaf spectral profiles, while the remote sensing sensors observe the vegetation canopy from the satellites. Therefore, we used the PROSAIL model [47] to simulate canopy spectra under various conditions. Table 3 shows the inputs of the PROSAIL model, generating 100,000 profiles. Figures 4 and 5 show the uncertainty of the comparison, using TM effective reflectance as the truth value. Both the ECOSTRESS and PROSAIL datasets show a high level of bad evaluation with the largest uncertainty in the MSS NIR-1 band. The RSR shows the least correspondence (shown in Figure 1). The uncertainty ranges from 0.0062 (MSS red band) to 0.0472 (MSS NIR-1 band) in the ECOSTRESS dataset and ranges from 0.0056 (MSS Red Band) to 0.0986 (MSS NIR-1 band) in the PROSAIL dataset.   The effective reflectance calculated by RSRs and spectral dataset performs much better than the actual image data (such as in the later Landsat 8 and Sentinel 2 remote sensing image pairs [48]) because actual image pairs are associated with many more factors influencing the uncertainty. As the endmember and abundance vary in actual remote sensing pixels, the uncertainty brought by RSRs may differ from the simulation. In general, the effective reflectance of TM and MSS shows good consistency in the red band but the worst consistency in the NIR band.

Uncertainty Brought by Georegistration and Scale Effects
When comparing MSS scenes and TM scenes, the error of geographical factors must be considered. A common practice is to resample the higher resolution image to the lower resolution image [48][49][50], if there are resolution inconsistencies for the image pairs. This step of processing is usually helpful but under the condition of higher georegistration error.
The following procedure extracts the pixel values of a pair of MSS and TM scenes (shown in Figure 6). First, generate random points within the range of MSS and TM scenes. The number of random points generated is 100,000 per image. Then, extract the pixel value of the MSS scene and TM scene according to the point, respectively. Finally, calculate the variance of the receptive field (near the TM pixel extracted) and set the upper limit of the variance. The receptive field is an n × n pixels window, with n assigned to 1, 3, 5, 7, 9, and 11 and the upper limitation of the variance v upperbound is set as 0.002, 0.004, 0.006, 0.008, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, and 0.10. Only the image in which the final number of random points is over 1000 is applied for the evaluation. In addition to A, P, and U, the coefficient of determination is used to evaluate the correlation of the MSS and TM pixels extracted, as follows: where ρ TM and ρ MSS are the reflectance extracted by the random point, while ρ TM is the average of ρ TM .  Figures 7 and 8 show the results. By controlling the receptive field area and the upper limits of variance, we can quantitatively estimate the error of geographical factors and find the optimal receptive field and upper limitation of variance. For a certain upper limit v upperbound , the larger receptive field causes the lower uncertainty. However, the maximum of R 2 is not reached with the minimum v upperbound . Once the filter regulation of n and v upperbound is too strict, the number of random points used could become much smaller. A large-scale and homogeneous region is always associated with a minority of land cover, resulting in a localized distribution of the random point in the scatter diagram. Therefore, we regard n = 3 and v upperbound = 0.01 as the proper value because the maximum of R 2 is always reached, which is a compromise of screening regulations and screening results.

Uncertainty Brought by AOD Estimation
By the filter regulation proposed, the final A, P, and U are calculated and are shown in Table 4, for the total 479 pairs with n = 3 and v upperbound = 0.01, and the evaluation of TM LEDAPS products [3]. Not only are the MSS LSR data evaluated, but the TM LSR produced by the visibility-6S method and visibility-Qiu method are evaluated as well, because the validation of LSR produced by the visibility-6S method and visibility-Qiu using TM LEDAPS as the truth value can directly figure the uncertainty brought by different methods of AOD estimation, without any error from RSR difference, radiometric calibration, and geographical factors. As shown in Table 4, the best performance of the visibility method is acquired by the visibility-6S method, with the uncertainty of 0.020 (TM green band), 0.014 (TM red band), 0.009 (TM NIR band), and the uncertainty of 0.029 (MSS green band), 0.026 (MSS red band), 0.033 (MSS NIR-1 band), and 0.023 (MSS NIR-2 band). The uncertainty also decreases from the green band to the NIR band for both visibility methods, except the MSS NIR-1 band. As A increases from the green band to the TM NIR band, an overestimation of AOD by visibility method may be systemic bias. As the precision promotion of atmospheric correction could not directly reduce the uncertainty brought by the other factors, we divide the uncertainty sources into atmospheric and non-atmospheric factors. The uncertainty of non-atmospheric factors could be computed with the following two assumptions. One is that the atmospheric factors and non-atmospheric factors are independent. The other is that the MSS uncertainty brought by the atmosphere equals the TM uncertainty brought by the atmosphere for the similar bands. The results (shown in Table 5) show that the uncertainty brought by atmospheric factors is smaller than the uncertainty brought by non-atmospheric factors during the cross-comparison, which also suggests that the non-atmospheric factors cannot be ignored when using the MSS LSR data for time-series analysis.

Other Uncertainty Sources
Moreover, it is necessary to note that the uncertainty of the inter-comparison is a combination, including the uncertainty of the truth value. As reported, the dense dark vegetation method for AOD retrieval may not be suitable for the clamps in Australia [51,52] or for other positions where the vegetation is not dense and dark enough [53]. Table 6 shows the APU of the four seasons, respectively. The performance of the visibility method of winter (December, January, and February, DJF) is much worse than the other seasons. The reason might be explained by several aspects: (1) The dense dark vegetation pixels are insufficient, as the major vegetation is broad-leaved trees in northeast and North China. The surface reflectance of LEDAPS treated as the truth value may have higher uncertainty; (2) The solar elevation angle in winter is higher than in other seasons, causing a longer optical path and increasing the uncertainty brought by AOD estimation.
Additionally, occasional anomalies may cause an increase in uncertainty, especially in the earlier sensors. Figure 9 shows an example of oversaturation and detector striping in MSS images. Occasional anomalies occur more frequently in the early sensors and affect the final validation results, and the influence needs to be estimated quantitatively for further research.

Application of Time-Series SR in Spectral-Stable Land Cover
To examine the stability of the visibility-based surface reflectance of MSS in practice, we selected five ROIs across China, extracted the values of MSS, TM, and MODIS reflectance of the ROIs, and conducted time-series assessments. The ROIs cover three kinds of land cover, i.e., water body (ROI 1), desert (ROI 2 and 5), and evergreen vegetation (ROI 3 and 4), shown in Figure 10. The MSS LSR data are the atmospherically corrected image based on the visibility-6S method, which performs better than the visibility-Qiu method in the validation mentioned above. The TM LSR data are from the LEDAPS data, and the MODIS data used are from the MOD09A1.006 product, an eight-day composition with a 500 m resolution. The ROIs are selected by the time-series reflectance by the MODIS during the 20 years using the Google Earth Engine. Considering the resolution inconsistency of MODIS, TM, and MSS, though the ROIs are selected for spatial homogeneity in MODIS images, they are always heterogeneous in the smaller pixels of TM and MSS images. Here, we used the median value of ROIs of all three LSR data for times-series assessment, which can also remove many outliers caused by MSS defective sensor. All the remote sensing data used are free from cloud, with strict control by the cloud coverage and artificial visual interpretation.
Though the surface reflectance cannot be less than zero in practical terms, the value of remote sensing LSR product may sometimes be less than zero in some dark pixels, for the overestimation of AOD and WV causes the overestimation of path radiance. In addition, the shadow of cloud and topographic relief may cause the LSR to be less than zero. The negative values are found in all three LSR data used, without artificial interventions in later assessments.

Water Body
ROI 1 is an area of water body, Qinghai Lake, in Qinghai Province, between 36°55 -37°3 N, 100°0 -100°10 . Qinghai Lake is the largest saline lake in China and is also one of the remote sensing test fields of China, lying on the northeastern Tibetan Plateau at the average altitude of 3200 m [54]. As Qinghai Lake freezes in winter, all the remote sensing images used for ROI 1 are acquired during the unfreezing stage. Figure 11 shows the time series of the MSS LSR, TM-LEDAPS, and MODIS-MOD09A1 (simultaneous with the TM image). Among all the MSS data exhibited in Figure 11, outliers are found in the image of LM02_L1GS_143034_19750621_20180425_01_T2, especially for the green band. The TOA reflectance of ROI 1 calculated by the rescaling coefficients provided by the MTL file is less than zero. It may be caused by inappropriate radiometric calibration coefficients or invalid DN values derived from unknown sensor issues, which needs further research.  Removing the outliers mentioned above, Table 7 shows the statistics of time-series reflectance of ROI-1 of the three sensors, including the minimum, median, mean, maximum, and standard deviation (SD). The reflectance of the MSS green band ranges from −0.095 to 0.007, which is smaller than the reflectance of green bands of TM (ranging from 0.026 to 0.062) and MODIS (ranging from 0.008 to 0.062). In red band, MSS reflectance ranges from −0.026 to 0.021, between TM reflectance (ranging from 0.010 to 0.048) and the MODIS reflectance (ranging from −0.010 to 0.026). The reflectance of MSS NIR bands ranges from 0.000 to 0.025 (MSS NIR-1 band) and from 0.002 to 0.029, which is similar to the TM NIR bands (ranging from 0.006 to 0.044) and is greater than the MODIS NIR band (ranging from −0.010 to 0.021).

Desert
ROI 2 and ROI 5 are areas of desert, as stable high-reflectance regions. ROI 2 is located in northwestern Taklimakan Desert, in Xinjiang Province, between 39°26 -39°54 N, 81°24 -81°54 E. ROI 5 is located in the center of Badain Jaran Desert, in Inner Mongolia, between 40°33 -40°41 N, 103°18 -103°31 E. Figures 12 and 13 show the time series reflectances of ROI 2 and ROI 5, which vary in a larger range than water body. Tables 8 and 9 show the detailed statistics of time-series reflectance of MSS, TM, and MODIS in ROI 2 and ROI 5. In ROI 2, the differences of each group of statistics are smaller than the validation result in Section 4, which shows a good time continuity. Nevertheless, in ROI 5, the reflectance of green bands shows an obvious difference, which is found in the comparison not only between MSS and TM reflectance but also between TM and MODIS reflectance. For red and NIR bands, the reflectance of MSS shows a better consistency compared with the green band.

Vegetation
ROI 3 and ROI 4 are areas of vegetation. ROI 3 is located in Fujian Province, between 26°25 -26°35 N and 114°19 -114°29 E, and ROI 4 is located in Jiangxi Province, between 25°31 -25°39 N and 117°5 -117°12 E. The major vegetation of the two ROIs is subtropical evergreen broad-leaved forest. Furthermore, we select the months with small changes of vegetation spectrum according to the TM observation. Figures 14 and 15 show the time series reflectance of ROI 3 and ROI 4. Times series reflectance of TM and MODIS shows stable lines in the green and red bands, while the reflectance of the NIR band shows a wide distribution over observation date. For the green band, MSS reflectance is less than TM and MODIS reflectance in both ROI 3 and 4, which is also found in the evaluation in Section 4. For the red band, the MSS reflectance performs better than the validation result summarized in the previous section.    LEDAPS is a successful remote sensing image processing system that generates TM and ETM+ LSR products of Landsat 4, 5, and 7, based on the 6S model. Considering the lack of ozone data in the early years of remote sensing, the new framework proposed in this paper set the ozone data as a contrast value of 0.345 atm-cm. Topographic details within a scene of an MSS image are ignored due to our limited computing capability.
The major difference between the proposed framework and LEDAPS is that the groundbased visibility record is used as the input of 6S instead of the image-based retrieval of AOD. The benefit is that the image-based AOD retrieval method of MSS images is not needed. The image-based AOD retrieval method of MSS images is difficult to study today due to the limitation of AOD measurement in the 1970s and the inherent defects of MSS images. In addition, the precision of the image-based AOD retrieval method is influenced by the reported issues of image quality.
Moreover, LEDAPS uses the QA band as input in the process of generating SR to obtain the cloud mask [55], but the cloud mask of MSS images' QA band seems problematic [56]. The ground-based visibility records are independent of the remote sensing observation, so our proposed framework does not need the cloud mask.
As shown in Section 4, the results generated by our proposed framework have a larger uncertainty with a systemic bias compared to the LEDAPS LSR product of Landsat TM. The visibility-based AOD methods are regional, while artificial observations and weather conditions are also reported as error sources [57]. In our proposed framework, using a visibility record to generate MSS LSR data compromises accuracy and solvability. Although the proposed framework has a larger uncertainty than the LEDAPS, the systemic bias can be solved by the accumulating records of ground-based visibility and AOD results in recent decades.

Analysis of Potential Problems in Time-Series Analysis of Early Years
In addition to the uncertainty brought by atmospheric correction, a much larger uncertainty was found when directly comparing the MSS LSR generated by our proposed framework and the LEDAPS product of Landsat TM. As shown in Sections 4 and 5, the non-atmospheric factors (e.g., the difference of RSRs of TM and MSS, the georegistration uncertainty, the radiometric calibration uncertainty, and image noises) bring more uncertainties than atmospheric factors (e.g., AOD and WV) overall. It proves that non-atmospheric factors cannot be ignored in time-series research using MSS data.
Considering the uncertainties brought by all the potential factors, the red and NIR-2 bands among all the MSS LSR bands and data acquired in spring (MAM), summer (JJA), and autumn (SON) among all the seasons are recommended for usage.
Additionally, the quality of MSS images varies considerably. The oversaturated pixels are shown in the QA band of MSS images, while some issues, including detector striping, are probably not shown in the QA band. In Sections 5, outliers in ROI 1 are found with unknown reasons, while some bands of the MSS images were found to be missing. The generated MSS LSR data need to be selected artificially for time-series analysis.

Future Work
The AODs retrieved by station-based visibility and satellite data have differences and inconsistencies. The station-based visibility records are obtained in spatial points with a specific time interval. At the same time, the image-based AOD is retrieved at the regional scale with different measure times from the visibility records. In the proposed framework, we use the most straightforward method, using the maximum value from a search radius of space and time for a scene of MSS images. Another method is to use spatial and temporal interpolation methods. The consistencies between AODs retrieved by ground-based visibility and satellite still require further research.
Another research direction is the evaluation. As the proposed framework needs land cover data from the 1970s as input, the evaluation is limited to China. However, the amount of Landsat MSS images covering China is unbalanced. The number of pairs of simultaneous images of MSS and TM used in evaluation became much lower than the total number of MSS and TM images, as the MSS and TM sensors were not always simultaneously turned on. The evaluation is not as thorough as the later Landsat 5, 7 [3], and Landsat 8 [4]. The study area will be extended to the whole world for future research.
Moreover, the usability of the MSS land surface reflectance product relates not only to atmospheric correction but also radiometric calibration, georegistration, denoising, cloud mask, and other factors. As the MSS LSR data are likely to be used in the time-series analysis together with other Landsat data, the consistency and correction of Landsat MSS and TM is also an important topic for future research. Though the work is largely promoted recently in radiometric calibration [5,6] and georegistration [8], more efforts are continuously needed to constitute the long-time-series Landsat MSS LSR archive.

Conclusions
The Landsat MSS dataset is the only multi-spectral global observation record by remote sensing in the 1970s, which plays a vital role in tracing time-series research to the 1970s. However, no MSS LSR product is currently publicly available. We propose a framework to generate the MSS LSR data using ground-based visibility records as input, overcoming the congenital deficiencies of the MSS sensor. The results are evaluated by the simultaneous observation by MSS and TM sensors in Landsat 4 and 5 using Landsat 4-5 TM LEDAPS product as the truth value. The evaluation results show that uncertainties decrease from the green band to the NIR band with a systemic bias. In addition, the non-atmospheric factors (e.g., the difference of RSRs of TM and MSS, the georegistration errors, the radiometric calibration uncertainty, and image noises) bring more uncertainties than atmospheric factors (e.g., AOD and WV) overall. Considering that both the sensors and preprocesses of MSS are comprehensively inferior to the later sensors and that the uncertainties brought by non-atmospheric factors cannot be reduced by atmospheric correction, the evaluation results are comparable with the LEDAPS LSR product. We apply the MSS LSR data generated by the proposed framework on time series analysis in five ROIs of the spectral-stable land cover in China for all the MSS sensors. The application demonstrates the potential and promise of the MSS LSR data generated by the proposed framework. In general, the proposed framework can effectively generate the MSS LSR data of the 1970s.