Normalized Di ﬀ erence Vegetation Index Continuity of the Landsat 4-5 MSS and TM: Investigations Based on Simulation

: Landsat 4-5, built at the same time and with the same design, carrying the Multispectral Scanner System (MSS) and the Thematic Mapper (TM) simultaneously, jointly provided observation service for about 30 years (1982–2013). Considering the importance of data continuity for time series analyses, investigations on the continuity of the Landsat 4-5 MSS and TM are required. In this paper, characterization differences between the Landsat 4-5 MSS and TM were initially discussed using the synthesized reflectance records generated from a collection of Hyperion hyperspectral profiles which were well calibrated and widely distributed. The difference in near-infrared region mostly contributed to the difference in normalized difference vegetation index (NDVI) between MSS and TM, while the between-sensor difference in red spectrum was relatively minor. Models for transforming MSS NDVI to TM NDVI were proposed, and validated subsequently through cross-validation tests. Furthermore, effectiveness of the transformation models was investigated using eight synchronous observation pairs of the Landsat 5 MSS and TM. On average, the univariate models through ordinary least squares regression (OLS) regression resulted in a decrease about 10% of the median relative difference (MdRD). Meanwhile, the bivariate models improved the NDVI comparability in most cases, especially when the transformation models through ridge regression were implemented. The univariate model through OLS regression could be the only solution for cases when problems of data quality are encountered (e.g., problem in the MSS near-infrared channel (800–1000 nm)). In conclusion, the findings on NDVI transformation models from MSS to TM are valuable for reference, because of the collection of diverse Hyperion hyperspectral profiles used.


Introduction
The Landsat program has been operated since the launch of its first satellite in 1972, collecting space-based imagery with moderate spatial resolution of the Earth's surface. Consequently, the Landsat archive is considered a unique database which is beneficial to both the academic community and society, particularly since the implementation of the free data policy in 2008 [1]. The Landsat program is considered a relatively consistent mission, which will be further sustained operationally by the successors (e.g., Landsat 9 and Landsat 10) [2][3][4]. Compared with the predecessor(s), the successor(s) generally may provide improved land surface information through more advanced instrument(s). Consequently, it may challenge the time series analyses of the Landsat archives for which data continuity is required. Thus, there is a need to define quantitative transformations between Landsat sensors to ensure a long-term archive with consistency [5]. In particular, Landsat 4 and Landsat 5 were built at the same time and with the same design, carrying the Multispectral Scanner System (MSS) and the Thematic Mapper (TM) simultaneously, which provided observation services successively for about 30 years (1982-2013). More importantly, Landsat 5 has been recognized as the longest-operating Earth observation satellite, owing to its provision of high-quality, global data of Earth's land surface for about 29 years (https://landsat.gsfc.nasa.gov/historic-landsat-5-mission-ends/).
Overall, the TM has improved properties in spectral and spatial resolution, compared with the MSS (Table 1), which made different performances of TM for some applications [6][7][8]. Meanwhile, the TM has one near-infrared (NIR) channel covering the range of 760-900 nm, whereas the MSS has two approximately corresponding channels over the NIR region, labeled NIR1 (700-800 nm) and NIR2 (800-1000 nm) respectively (Table 1 and Figure 1). Figure 2 shows the valid archives of Landsat 4-5 MSS and TM over a specific region. The annually continuous records consist of overlap observations by different sensors and platforms, while Landsat 5 contributes most archived data products of this region ( Figure 2). As there has been no MSS instrument onboard the successive Landsat satellites after Landsat 5, comparability between the MSS and the successors (e.g., TM) in observations and derived variables should be understood. Consequently, Landsat 4 and Landsat 5 provided an important overlap of the MSS and TM observations [9]. Considering the importance of Landsat continuity and to make full usage of the Landsat archive, investigations on the continuity of MSS and TM were required.
To our knowledge, comparisons of the MSS and TM were previously shown [6][7][8][9][10]; however, practical applications of the products were relatively limited due mainly to the difficulty in data accessibility [3]. Thanks to the data open policy [1][2][3], the Landsat archive with continuing inputs is freely accessible to all users, which enriched related investigations for historical issues especially by incorporating the archive of MSS and TM [11][12][13][14][15][16]. Although the MSS and TM have been consistently calibrated (with small discrepancy), it must be emphasized that they still differ significantly from the viewpoint of their spectral properties [17]. For example, characterization differences between the Landsat sensors result in discrepancies in land use classification [7,8,15], water quality retrievals [18], and vegetation indices [5,10,19,20]. Nevertheless, the sensor related bias in the normalized difference vegetation index (NDVI) was not considered in several investigations, such as in [15,21].
In this study, characterization differences of the Landsat 4-5 MSS and TM were investigated, mainly in terms of NDVI continuity. NDVI is considered one of the most commonly used vegetation indices [5], which also serves as a key parameter in Landsat higher-level science products currently [22]. To this end, a spectra collection comprising a total of 10,000 hyperspectral profiles which were elaborately selected from the well calibrated Hyperion archive [23], was used to simulate the channel reflectance of the Landsat 4-5 MSS and TM. The Hyperion spectra collection provided a chance to compare the instruments with different characterizations through synthesizing the broad multispectral channels [24][25][26]. Accordingly, characterization difference between MSS and TM were demonstrated, especially for the approximately corresponding channels covering red and near-infrared spectral regions followed by derived NDVI. Furthermore, to ensure NDVI continuity between different observations (i.e., by MSS and TM), transformation models were proposed respectively. The models' performances were compared and discussed in detail, through cross-validation tests and case applications (see Section 3.4. Application Cases). Finally, suggestions and conclusions are presented. Results mentioned in this paper are mainly based on simulation records except the discussion on case applications of transformation models.  1 The TM thermal infrared channel was originally acquired at 120 m resolution. However, products processed before February 25, 2010 are re-sampled to 60 m, and products processed after February 25, 2010 are re-sampled to 30 m. Information in Table 1 1 The TM thermal infrared channel was originally acquired at 120 m resolution. However, products processed before February 25, 2010 are re-sampled to 60 m, and products processed after February 25, 2010 are re-sampled to 30 m. Information in Table 1 was obtained from https://landsat.usgs.gov. The value in parentheses is the spatial resolution correspondingly.  (Table 1) are not shown in this figure. The data were obtained from https://landsat.gsfc.nasa.gov.   (Table 1) are not shown in this figure. The data were obtained from https://landsat.gsfc.nasa.gov.

Spectral Response Function (SRF) of Landsat 4-5 MSS, TM, and Hyperion
The spectral response function (SRF) data of the Landsat 4-5 MSS and TM were obtained at https://landsat.gsfc.nasa.gov. Currently, the SRFs of Landsat 4-5 MSS publicly available are provided with 10 nm for the Green, Red, and NIR1 channels, while with 20 nm for the NIR2 channel. Meanwhile, the TM SRFs are mainly recorded at 1 nm intervals, although some responses are sampled with intervals at 5 or 10 nm (https://landsat.gsfc.nasa.gov). Accordingly, to make the reflectance integration easily applicable, a spline interpolation method was used to resample the SRFs of MSS and TM to a spectral resolution of 1 nm [19]. The choice of interpolation methods affected channel reflectance calculation slightly, with a relative variation of less than 0.02% (in another manuscript submitted).
At the same time, the spectral response function of Hyperion is not publicly accessible. In this paper, for a specific channel i of Hyperion, the SRF was generated through a Gaussian function, using the full width at half maximum (FWHM) and center wavelength (Equation (1)), as mentioned in previous investigations [24,26,27].
where FWHM H i is FWHM of channel i, and λ H iC is the center wavelength correspondingly. The superscript "H" stands for Hyperion. The center wavelength and the FWHM used to get the SRFs of Hyperion were accessed at https://eo1.usgs.gov.

Hyperion Spectra Collection
As a part of the NASA New Millennium Program, the Earth Observing-One (EO-1) was launched in November 2000 [28]. The EO-1 spacecraft carried three primary instruments onboard, including the Advanced Land Imager, the Hyperion, and the Linear Etalon Imaging Spectrometer Array Atmospheric Corrector [29]. In particular, the Hyperion instrument was the first imaging spectrometer, acquiring data from space [28], which is capable of resolving 220 unique spectral channels (covering 400-2500 nm). Each Hyperion channel is provided with averagely 10 nm FWHM with a spatial resolution of 30 m [29]. The Level-1 radiometric product has a total of 242 channels, but only 198 channels are calibrated. In addition, due to an overlap between the visible-NIR and SWIR focal planes, there are 196 unique channels well calibrated (https://eo1.usgs.gov). The calibration of Hyperion was radiometrically stable to within 5.0% over the first eight years [28]. Moreover, the radiometric stability of Hyperion during 2013-2014 was confirmed that it varied within 5.0% over the visible-NIR regions and within 10.0% over the SWIR region [30]. Hyperion profile well calibrated has been proven to be valuable for comparing different instruments with multispectral channels through a simulation strategy [24,25]. The Hyperion hyperspectral spectra have been used in applications for cross-sensor calibration and transformation [23][24][25][26][27][31][32][33].
In this paper, a collection of 10,000 Hyperion profiles covering a wide range of land surface was used to synthesize the corresponding channel reflectance of the Landsat 4-5 MSS and TM. Currently, the collection has also been used for channel reflectance transformation between the Landsat 8 Operational Land Imager and the Sentinel-2 Multi-Spectral Instrument in an initiative program called "Harmonized Landsat Sentinel-2" [23,31]. Details about the Hyperion spectra collection are presented in [23].

Paired MSS and TM Observations of Landsat 5 for Case Studies
The transformation models proposed based on simulation results are further discussed through application cases, in addition to cross-validation tests (see Section 3.3. Transformation Models and Comparison for details). Specifically, eight pairs of the Landsat 5 MSS and TM images synchronously Remote Sens. 2019, 11, 1681 5 of 18 acquired (over the WRS-2 Path/Row 123/032, covering mainly over the Beijing-Tianjin-Hebei region, China) in the Collection 1 Level-1 data product were obtained from https://earthexplorer.usgs.gov/ ( Table 2). The occasionally synchronous acquisitions by MSS and TM onboard Landsat 5 provide a good opportunity to assess the transformation models based on the simulation records. To show the effectiveness of the transformation models correspondingly, pixels with best quality (i.e., are not contaminated by cloud, shadow, saturation, and with no fill, also served as "clear terrain") located within relatively homogeneous area were randomly collected through a sampling procedure similar as in previous investigations [34,35]. Specifically, 10,000 samples randomly selected for each pair of the MSS and TM observations were used in final assessments. For the Landsat Collection 1 Level-1 data product, a quality assessment (QA) band with 16-bit is provided, which allows users to apply per pixel filters and to determine the suitability for application (https://www.usgs.gov/land-resources/ nli/landsat/landsat-collection-1). In this paper, considering the reliability of the QA interpretation, the best pixels were determined according to the QA Band information of TM data (see Section 4.2. Transformation Modeling and Application), during the sampling procedure. Table 2. Archived pairs of Landsat 5 MSS and TM observations used for case applications 1 .

Channel Reflectance Simulation
Comparison of MSS and TM in terms of the reflectance over approximately corresponding channels and derived NDVI was mainly based on synthesized channel reflectance. The same as in previous investigations [26], the following procedures were adopted to obtain channel reflectance for the Landsat 4-5 MSS and TM correspondingly. Specifically, the weights, weights normalization, and reflectance integration as a weighted sum can be obtained respectively by where SRF L Bi (λ) is the spectral response function of the Landsat 4-5 MSS (TM) channel Bi, with the start wavelength and the end wavelength of λ L BiS and λ L BiE respectively. Similarly, SRF H i (λ) is the spectral response function of Hyperion channel i, with the start wavelength and the end wavelength of λ iS and λ iE respectively. W H i and nW H i are the weight and normalized weight for Hyperion channel i, respectively. Re f L Bi is the simulated channel reflectance of the Landsat 4-5 MSS (TM) correspondingly. As Equation (4) shows, Re f L Bi is a weighted sum of all valid Hyperion channel reflectance (Re f H i ) with the center wavelength (λ H iC ) being located within (λ L BiS , λ L BiE ). The superscripts "L" and "H" stand for corresponding channels of the Landsat 4-5 MSS (TM) and Hyperion respectively.

NDVI Estimation of the Landsat 4-5 MSS and TM
In this study, NDVI [36] as one of the most commonly used vegetation indices to describe nature and urban environments [5,21,[37][38][39], was mainly investigated. As Equation (5) shows, NDVI measures reflectivity contrast between the NIR channel and the Red channel. The underlying principle of NDVI is that radiation in the red spectrum is considerably absorbed (or poorly reflected) by chlorophyll in green plants, while radiation in the near infrared spectrum is strongly reflected by the spongy mesophyll leaf structure [36]. Accordingly, the NDVI provides a standardized method of mapping and comparing vegetation greenness retrieved from different remotely sensed observations. However, differences (even minor) in channel characterizations may result in an amplified bias in NDVI among sensors [5,19,20,26,40,41]. The comparability of NDVI retrieved from the Landsat 4-5 MSS and TM was investigated, followed by possible transformation models to ensure the NDVI continuity across different sensors and platforms.
where Re f L red and Re f L nir are the channel reflectance covering the red and near-infrared spectrum regions, respectively. The TM NDVI was generally obtained using the Red channel (B3) and the NIR channel (B4) ( Table 1 and Figure 1), labeled as TM43. Meanwhile, for the MSS, which has two NIR channels located in 700-800 nm (NIR1, B3) and in 800-1100 nm (NIR2, B4) respectively, with a Red channel (B2) ( Table 1 and Figure 1), two MSS NDVIs labeled as MSS32 and MSS42 were estimated accordingly. In previous investigations (e.g., in [10,15,21]), the MSS NIR2 was used in NDVI calculation.

Between-Sensor Difference Measures
The relative difference (RD) can be defined, to measure the individual between-sensor difference of sample j, for the corresponding variable (Var i , i.e., channel reflectance and NDVI).
Furthermore, the overall between-sensor differences for the corresponding variable were estimated using two indicators, including the median difference (MdD) and the median relative difference (MdRD), which can be obtained by where Var S1 ij and Var S2 ij are corresponding values of sample j (j = 1, 2, . . . , n) for the variable Var i of sensors S1 and S2 (used as reference) respectively, while median( ) is to get the median value. The RD and MdRD were used to make the comparison of difference measures between different variables easier. Unlike previous investigations in which the mean difference was calculated [5,26], this paper used the median value to alleviate the effects of outliers on the difference measures. Meanwhile, variables for TM were used correspondingly as the references and dependent variables (as "S2") in the between-sensor comparison and transformation respectively, except for the results presented in Table 2.

Transformation Models
As shown previously, to improve the continuity between different Landsat sensors in channel reflectance and NDVI (e.g., for Landsat 7 and Landsat 8 [5,42]), the linear transformation model through ordinary least squares (OLS) regression was proven applicable. Accordingly, we focused in this study on the topic of NDVI continuity between the Landsat 4-5 MSS and TM. To this end, the TM NDVI served as the response while the MSS NDVI(s) was (were) the predictor(s) (Equation (9)). Two strategies for the transformation from MSS NDVI (i.e., MSS32 and MSS42) to TM NDVI (TM43) are univariate linear model and bivariate linear model respectively. For both models, OLS regression usually used for linear model was preliminarily investigated. Furthermore, considering the multicollinearity issue, ridge regression [43] was applied to estimate the bivariate model accordingly. Moreover, the models were established to make NDVI continuity from the Landsat archive of MSS and TM, including (1) intra-platform transformation model (i.e., from MSS to TM) and (2) inter-platform transformation model (i.e., from Landsat 4 to Landsat 5).
The linear models obtained through OLS regression were acceptable for cross-sensor transformation, as shown previously [5,42]. It is worthwhile to bear in mind that suspectable estimates (with overestimated larger confidence intervals) are likely obtained by OLS regression since heteroscedasticity is usually observed in data. Meanwhile, for the bivariate model, the model estimates largely rely on the independence of the predictors. When the predictors are correlated (known as multicollinearity issue), the OLS estimates are sensitive to the errors in response (here is TM43), resulting in a large variance in model estimates. In fact, there are significantly linear correlations between the response (i.e., TM43) and the predictors (i.e., MSS32 and MSS42) ( Table 2). Thereby, ridge regression [43] as one of the solutions to address the multicollinearity problem was implemented further for the NDVI transformation model. To our knowledge, the ridge regression has not been employed previously for the between-sensor transformation.
Similar to a previous investigation [26], uncertainties as well as improvements associated with a transformation model were demonstrated through a cross-validation strategy first. Specifically, the K-fold cross-validation (K = 5) with 10,000 simulations was simultaneously performed for individual models (Equation (9)), generating a total of 50,000 validation cases for each model. For each validation case, the median relative differences (Equation (8)) after and before transformation were calculated respectively and compared accordingly. The improvements of the NDVI consistency between TM and MSS were shown through the decrease in the median relative difference. In addition, applicability of the proposed transformation models was further investigated through several case studies as listed in Table 2.

Intra-Platform Differences
Overall, compared with the approximately corresponding channels of MSS, the TM channels are characterized with spatially higher resolution and relatively narrower spectral range (i.e., for the Green and Red channels) ( Table 1) as well as a spectral shift in SRFs (Figure 1). Regarding intra-platform difference, on average the MSS NIR1 (700-800 nm) reflectance is significantly smaller (with MdD about −0.025) than the TM NIR reflectance, whereas the MSS NIR2 (800-11,000 nm) reflectance is greater (with MdD about 0.015). Meanwhile, the high degree of agreement between MSS and TM is observable over the Red channel, although MSS has a slight and negative median difference (with MdD about −0.001) ( Figure 3). As defined (Equation (5)), the NDVI relates to the reflectivity contrast between the near-infrared and the red spectral regions. Actually, reflectance biases of individual channels are important when normalized spectral indices (e.g., NDVI) are considered [19,20,26]. The fact that the reflectance for the NIR1 channel and the Red channel of MSS are significantly greater and slightly smaller than the reflectance for the corresponding TM channels respectively (Figure 3), alleviates the reflectivity contrast and results in a decrease (negative bias) in the MSS NDVI (MSS32). Conversely, the moderate positive bias for the MSS NIR2 channel enhances the reflectance contrast and generates a moderate increase (positive bias) in the MSS42. Accordingly, by comparing against TM43, a significantly  . The channels over red and NIR regions are presented, which were used for normalized difference vegetation index (NDVI) calculation. General differences are presented, including the median difference (MdD, Equation (7)) and the median relative difference (MdRD, Equation (8)). To get the difference measures, variables for TM were used as references respectively. The black dashed lines in all subplots are 1:1 lines superimposed for reference. Data density is indicated by color, with color from blue to red representing the data density from low to high.

Inter-Platform Differences
The inter-platform NDVI differences (e.g., MSS42) are generally slight when the same channel combination is used, compared against the differences resulted from different channel combinations (e.g., the Landsat 4 MSS32 versus the Landsat 5 MSS42). The inter-platform difference for TM NDVI is the least, with Landsat 4 having a MdRD of 0.42% compared against Landsat 5. On average the between-sensor (i.e., MSS and TM) difference recorded between MSS32 and TM43 is more obvious, compared with the difference between MSS42 and TM43. In particular, the inter-platform MdRDs observed for MSS32 and MSS42 are about -13.0% and 8.0% respectively. . The channels over red and NIR regions are presented, which were used for normalized difference vegetation index (NDVI) calculation. General differences are presented, including the median difference (MdD, Equation (7)) and the median relative difference (MdRD, Equation (8)). To get the difference measures, variables for TM were used as references respectively. The black dashed lines in all subplots are 1:1 lines superimposed for reference. Data density is indicated by color, with color from blue to red representing the data density from low to high. Table 3. General comparison of the NDVIs obtained through different sensors (channels) onboard Landsat 4 (L4) and Landsat 5 (L5), based on simulation results from 10,000 Hyperion spectra samples. In this table, two difference measures are presented, including the median relative difference (MdRD 1 (Equation (8)), filled with gray background) and the correlation coefficient through Spearman rank test.

Inter-Platform Differences
The inter-platform NDVI differences (e.g., MSS42) are generally slight when the same channel combination is used, compared against the differences resulted from different channel combinations (e.g., the Landsat 4 MSS32 versus the Landsat 5 MSS42). The inter-platform difference for TM NDVI is the least, with Landsat 4 having a MdRD of 0.42% compared against Landsat 5. On average the between-sensor (i.e., MSS and TM) difference recorded between MSS32 and TM43 is more obvious, compared with the difference between MSS42 and TM43. In particular, the inter-platform MdRDs observed for MSS32 and MSS42 are about −13.0% and 8.0% respectively. Consistent conclusions can be summarized for both intra-platform and inter-platform cases. The characterization differences of the approximately corresponding channels between MSS and TM, especially of the channel(s) over near-infrared region, largely contribute to the between-sensor difference in NDVI (Table 3 and Figure 3). Much more comparability among NDVI estimations was obtained when same channel combination was implemented. Meanwhile, correlation with significance between the NDVI pairs ( Table 3), suggests that linear transformation model is potentially effective to improve the NDVI continuity among the Landsat 4-5 MSS and TM observations.

Transformation Models and Comparison
Corresponding transformation models for intra-platform and inter-platform are listed in Table 4, including univariate models and bivariate models, of which the coefficients are median estimates based on cross-validation tests (see Section 2.7. Transformation Models for details). The model performance was assessed using three indicators including the MdRD, the MdD, and the mean square error (MSE). Significant decreases in the between-sensor (as MSS minus TM, using TM as reference) MdRDs are observable through the transformation models correspondingly (Tables 3 and 4, Figures 4 and 5). It appears that the bivariate models show better performance in terms of prediction capability, although the univariate models are able to improve the NDVI comparability between MSS and TM to some extent. Specifically, for Landsat 4, the MdRD of MSS32 decreased to about −1.30% from about −13.00% when the univariate model was implemented. Meanwhile, the between-sensor NDVI MdRD was further alleviated to less than 0.30% through the bivariate transformation models (Table 4 and Figure 4). Compared against the bivariate OLS regression models, the ridge regression models potentially are more effective, with a decrease about 0.12% in MdRD (Table 4, Figures 4 and 5). Furthermore, the inter-platform difference in TM NDVI (between Landsat 4 and Landsat 5) also can be alleviated through the OLS regression model, with a decrease of 0.40% in MdRD.       In summary, more significant between-sensor differences in NDVI compared with the differences in channel reflectance are illustrated using the synthesized channel reflectance. Accordingly, to improve the NDVI continuity of the Landsat 4-5 MSS and TM, transformation models were proposed and assessed through cross-validation tests.

Application Cases
Actual effectiveness of the proposed transformation models (Table 4) is required to test in terms of practical applications, in addition to the cross-validation tests mentioned above. Therefore, eight archived observation pairs, synchronously acquired by the MSS and TM onboard Landsat 5, were selected from https://earthexplorer.usgs.gov/ for case studies ( Table 2). The between-sensor differences in NDVI before and after transformation are shown (Table 5) respectively, in which two measures included are the MdD and the MdRD. The difference measures for each pair of observations are based on 10,000 randomly selected samples interpreted with best quality using QA band information (exclusively on TM). The improvements in NDVI comparability between the Landsat 5 MSS and TM observations are generally significant through transformation model (Table 5). On average, the univariate OLS regression models resulted in a decrease about 10% in MdRD. Furthermore, the bivariate models improved the NDVI comparability in most cases, with a decreased MdRD to 10% approximately, especially when the models through ridge regression were implemented. Most samples move to and locate around the 1:1 line ( Figure 6) after transformation, showing the pixel-level improvements in NDVI continuity. However, the univariate model was the only solution when data problem was observed (e.g., cases 7 and 8 in this paper, see Figure 6).
The application cases confirm the effectiveness and robustness of the transformation models through ridge regression, showing accordance with the findings based on cross-validation tests (Figures 4, 5 and 8). However, when there are problematic records in MSS observations (e.g., in NIR2 for cases 7 and 8 in Table 5), the univariate OLS regression model is the only possible solution for the NDVI continuity issue between MSS and TM.

Channel Reflectance Simulated from Hyperion Spectra
Well calibrated hyperspectral profile (i.e., Hyperion) was considered valuable in comparing the instruments with different characterizations through synthesizing the broad multispectral channels [24,25]. In this paper, the collection of 10,000 Hyperion profiles was used to simulate the corresponding channel reflectance of the Landsat 4-5 MSS and TM. The profiles used in this paper were distributed over a wide range of land surface conditions showing diversities in geography and climatology (details are presented in [23]), which largely ensure the investigation representativeness and the reliability of findings. Actually, the Hyperion archive with good calibration provides a unique source for building biome-specific spectra libraries using actual observed conditions around the world [23]. Furthermore, the case applications of the proposed transformation models demonstrate the usability of the well calibrated hyperspectral profiles (i.e., Hyperion) for issues of the sensors with multispectral broad channels, including channel simulation, between-sensor comparison, and transformation model tests.
Considering the broad channels of the Landsat 4-5 MSS and TM (Table 1) compared against the Hyperion channel provided with the FWHM of 10 nm, the simulation of channel reflectance was thought to be accurate. However, simulation with Hyperion spectra and the weighted integration (Equation (4)) possibly becomes less effective when multispectral channel provided with relatively narrow range is considered, as shown previously [24]. When a narrower multispectral channel is investigated, more other hyperspectral profile collections [44] or suitable methods [45] are necessary.

Transformation Modeling and Application
In contrast with the univariate model, the bivariate model provides a plane depicting the quantitative relation between the response (i.e., TM43) and predictors (i.e., MSS32 and MSS42). It is clear that all samples locate much closer to the planes of bivariate regression models, while parts of them deviate from the regression lines of univariate models (Figure 7). The OLS regression shows a large variance in model estimates and obvious model prediction uncertainty (error range), compared with the ridge regression (Figures 4, 5, and 8). Figure 8 illustrates that the model estimates for ridge regression are more centralized around the median correspondingly, suggesting smaller estimate

Channel Reflectance Simulated from Hyperion Spectra
Well calibrated hyperspectral profile (i.e., Hyperion) was considered valuable in comparing the instruments with different characterizations through synthesizing the broad multispectral channels [24,25]. In this paper, the collection of 10,000 Hyperion profiles was used to simulate the corresponding channel reflectance of the Landsat 4-5 MSS and TM. The profiles used in this paper were distributed over a wide range of land surface conditions showing diversities in geography and climatology (details are presented in [23]), which largely ensure the investigation representativeness and the reliability of findings. Actually, the Hyperion archive with good calibration provides a unique source for building biome-specific spectra libraries using actual observed conditions around the world [23]. Furthermore, the case applications of the proposed transformation models demonstrate the usability of the well calibrated hyperspectral profiles (i.e., Hyperion) for issues of the sensors with multispectral broad channels, including channel simulation, between-sensor comparison, and transformation model tests.
Considering the broad channels of the Landsat 4-5 MSS and TM (Table 1) compared against the Hyperion channel provided with the FWHM of 10 nm, the simulation of channel reflectance was thought to be accurate. However, simulation with Hyperion spectra and the weighted integration (Equation (4)) possibly becomes less effective when multispectral channel provided with relatively narrow range is considered, as shown previously [24]. When a narrower multispectral channel is investigated, more other hyperspectral profile collections [44] or suitable methods [45] are necessary.

Transformation Modeling and Application
In contrast with the univariate model, the bivariate model provides a plane depicting the quantitative relation between the response (i.e., TM43) and predictors (i.e., MSS32 and MSS42). It is clear that all samples locate much closer to the planes of bivariate regression models, while parts of them deviate from the regression lines of univariate models (Figure 7). The OLS regression shows a large variance in model estimates and obvious model prediction uncertainty (error range), compared with the ridge regression (Figures 4, 5 and 8). Figure 8 illustrates that the model estimates for ridge regression are more centralized around the median correspondingly, suggesting smaller estimate variance. In particular, considering the MSS42 estimates, about 60% of the estimates are located around the median varying within (−0.001, 0.001) for ridge regression, while nearly 50% of the estimates are observed for OLS regression. Accordingly, the ridge regression model is more effective to better between-sensor comparability of NDVI (Table 4, Figures 4 and 5).
transformation (e.g., for NDVI) [5,26,42]. Moreover, to get the transformed TM NDVI from MSS NDVI (i.e., MSS32 and MSS42), the bivariate model through ridge regression is more effective in terms of its prediction performance (Table 4, Figures 4,5). Compared to OLS regression, ridge regression with an added regularization term addresses the collinearity problem and reduces the variance of the estimates with improved model credibility [43], as shown in Figures 4,5. Nevertheless, as mentioned before (see Section 3.4. Application Cases for details), the univariate OLS regression model is necessarily required (even as the only alternative) when the quality problem of the MSS data products is encountered. As we all know, the Landsat series observations had been affected by many factors (https://www.usgs.gov/land-resources/nli/landsat/landsat-known-issues). Accordingly, the model alternatives are largely determined by observation quality as well as data availability in practice.  Figure 3, the color from blue to red is representing the data density from low to high.  Table 4.
Suspectable estimates (with overestimated larger confidence intervals) are likely obtained by OLS regression as heteroscedasticity is usually inherent in data. The heteroscedasticity was observed which resulted in an overestimated estimate interval of the transformation models through OLS regression in this paper. Even so, the estimate interval did not affect the model determination, showing negligible effects on final transformation results. The findings show accordance with previous studies that the linear models through OLS regression were acceptable for cross-sensor transformation (e.g., for NDVI) [5,26,42]. Moreover, to get the transformed TM NDVI from MSS NDVI (i.e., MSS32 and MSS42), the bivariate model through ridge regression is more effective in terms of its prediction performance (Table 4, Figure 4). Compared to OLS regression, ridge regression with an added regularization term addresses the collinearity problem and reduces the variance of the estimates with improved model credibility [43], as shown in Figures 4 and 5. Nevertheless, as mentioned before (see Section 3.4. Application Cases for details), the univariate OLS regression model is necessarily required (even as the only alternative) when the quality problem of the MSS data products is encountered. As we all know, the Landsat series observations had been affected by many factors (https://www.usgs.gov/land-resources/nli/landsat/ landsat-known-issues). Accordingly, the model alternatives are largely determined by observation quality as well as data availability in practice.
Increasingly archived MSS and TM observations have been processed into the Landsat Collection 1 Level-1 data product [3,4], which are readily accessible for applications (https://earthexplorer.usgs. gov/). Currently, over 0.6 million MSS and more than 2.7 million TM scenes of Landsat 5 have been freely accessible to users online (https://www.usgs.gov/land-resources/nli/landsat/landsat-5? qt-science_support_page_related_con=0#qt-science_support_page_related_con). Improvements for the MSS archive have been especially made in current version, compared against the previous MSS products (https://www.usgs.gov/land-resources/nli/landsat/landsat-1-5-multispectral-scannercollection-1). However, as indicated by the file name, all the Landsat 5 MSS data used were assigned to Tier 2 (labeled as "T2", Table 2), not meeting the Tier 1 geometry specification (https://www.usgs.gov/ land-resources/nli/landsat/landsat-collection-1). At the same time, all Landsat 5 TM images assigned to Tier 1 are considered suitable for time series analyses. Accordingly, to determine their application suitability, especially for time series analyses, the MSS scenes assigned to Tier 2 should be checked carefully. Fortunately, no significant geometry shifts were found visually between the eight pairs of the Landsat 5 MSS and TM observations (Figure 9). In this paper, pixels located within relatively homogeneous area were selected for assessments of the transformation models. Additionally, taking into account possible geo-registration uncertainty in scene-to-scene comparison (although visually undetected), the average NDVI within neighboring area (i.e., a window with 7 × 7 pixels at the MSS imagery, with approximately corresponding area covering 15 × 15 pixels at the TM imagery) around a selected sample was used as the representative thereby. The averaged NDVI values were used for the difference measures (Table 5), and are shown in Figure 6. Nevertheless, difference in spatial resolution between the MSS and TM images (Table 1) should be further investigated, especially when pixel-to-pixel application is required. Moreover, QA interpretation of the TM imagery is more reliable, although cloud shadow overestimation and clear terrain underestimation are observed ( Figure 9). Meanwhile, cloud omission as well as the absence of attributes for cloud shadow may cause interpretation unreliability of the MSS QA band. Accordingly, the QA interpretation of the TM imagery was used for determining the pixel with "clear terrain" exclusively ( Figure 9).
In summary, the occasionally synchronous observations by MSS and TM onboard Landsat 5, which were well archived, provide a good chance to validate the proposed transformation model, in addition to cross-validation tests. Findings indicate that the significance of between-sensor difference in channel characterizations should be cautioned when time series analyses based on continuous records (e.g., NDVI) is required. The initially proposed models (Table 4) should serve as the preliminary version for the NDVI continuity between MSS and TM, although they showed applicability in case applications. Impacts associated with other factors should be considered further, such as geo-registration, land use/land cover types and phenology [41], radiometric calibration [9,17,46], the BRDF (bi-directional reflectance distribution function) effects [47], and atmospheric correction which may affect the NDVI continuity between different observations [48]. In particular, for current Landsat products, the atmospheric correction resulted in different impacts on surface reflectance for different channels [5,49,50]. Moreover, interpretation of the QA band information is more reliable for the TM data product, whereas detection of the cloud and cloud shadow for the MSS data product is still challenging. Furthermore, revised transformation models are necessarily needed for actual applications when problems in data quality are considered. radiometric calibration [9,17,46], the BRDF (bi-directional reflectance distribution function) effects [47], and atmospheric correction which may affect the NDVI continuity between different observations [48]. In particular, for current Landsat products, the atmospheric correction resulted in different impacts on surface reflectance for different channels [5,49,50]. Moreover, interpretation of the QA band information is more reliable for the TM data product, whereas detection of the cloud and cloud shadow for the MSS data product is still challenging. Furthermore, revised transformation models are necessarily needed for actual applications when problems in data quality are considered. Figure 9. Interpretation for three subsets extracted from the Landsat 5 MSS and TM Quality Assessment (QA) band information: color image (R: NIR channel, G: Green channel, B: Red channel) and the interpreted information. Specifically, the QA interpretations for TM are cloud (green), cloud shadow (blue), and "clear terrain" (red), while for MSS are cloud (cyan) and clear terrain (red). Generally, QA information for TM imagery is more reliable. All subsets illustrated are for the MSS and TM pairs of Case 3 (Table 3).

Conclusions
The Landsat program with successively operational satellites (seven as of April 2019) has provided continuous Earth observation since the launch of its first satellite in 1972, generating an important archive with unmatched properties in data quality, spatial coverage, and length of the time series. Both continuity and consistency between different sensors are required for time series analyses; however, it is usually challenged by characterization differences between sensors as well as numerous malfunctions of the spacecraft. In this study, characterization differences of the Landsat 4-5 MSS and TM were investigated, mainly with focus on NDVI continuity and consistency. Specifically, a collection containing 10,000 Hyperion profiles with well calibration was used to simulate the channel reflectance of the Landsat 4-5 MSS and TM. The investigations conducted provide insights into the characterization difference between the sensors.
According to the simulation results, general agreement between the Landsat MSS and TM was observed in the Red channel, although MSS had a slight and negative median difference approximating -0.001. Meanwhile, between-sensor difference was relatively observable in nearinfrared spectral wavelength. Specifically, by comparing with the TM NIR (760-900 nm) reflectance, the MSS NIR1 (700-800 nm) reflectance had a significant negative median difference (about -0.025), whereas the MSS NIR2 (800-1100 nm) reflectance showed a positive median difference (about 0.015). The difference over the near-infrared spectrum mostly contributed to the differences between MSS NDVI and TM NDVI. The between-sensor difference (either intra-platform or inter-platform) recorded between MSS32 NDVI (NIR1 was used) and TM NDVI (TM43) was more obvious, compared against the difference between MSS42 NDVI (NIR2 was used) and TM43. In particular, the negative (about -13.0%) and positive (about 8.0%) median relative differences in NDVI were Figure 9. Interpretation for three subsets extracted from the Landsat 5 MSS and TM Quality Assessment (QA) band information: color image (R: NIR channel, G: Green channel, B: Red channel) and the interpreted information. Specifically, the QA interpretations for TM are cloud (green), cloud shadow (blue), and "clear terrain" (red), while for MSS are cloud (cyan) and clear terrain (red). Generally, QA information for TM imagery is more reliable. All subsets illustrated are for the MSS and TM pairs of Case 3 (Table 3).

Conclusions
The Landsat program with successively operational satellites (seven as of April 2019) has provided continuous Earth observation since the launch of its first satellite in 1972, generating an important archive with unmatched properties in data quality, spatial coverage, and length of the time series. Both continuity and consistency between different sensors are required for time series analyses; however, it is usually challenged by characterization differences between sensors as well as numerous malfunctions of the spacecraft. In this study, characterization differences of the Landsat 4-5 MSS and TM were investigated, mainly with focus on NDVI continuity and consistency. Specifically, a collection containing 10,000 Hyperion profiles with well calibration was used to simulate the channel reflectance of the Landsat 4-5 MSS and TM. The investigations conducted provide insights into the characterization difference between the sensors.
According to the simulation results, general agreement between the Landsat MSS and TM was observed in the Red channel, although MSS had a slight and negative median difference approximating −0.001. Meanwhile, between-sensor difference was relatively observable in near-infrared spectral wavelength. Specifically, by comparing with the TM NIR (760-900 nm) reflectance, the MSS NIR1 (700-800 nm) reflectance had a significant negative median difference (about −0.025), whereas the MSS NIR2 (800-1100 nm) reflectance showed a positive median difference (about 0.015). The difference over the near-infrared spectrum mostly contributed to the differences between MSS NDVI and TM NDVI. The between-sensor difference (either intra-platform or inter-platform) recorded between MSS32 NDVI (NIR1 was used) and TM NDVI (TM43) was more obvious, compared against the difference between MSS42 NDVI (NIR2 was used) and TM43. In particular, the negative (about −13.0%) and positive (about 8.0%) median relative differences in NDVI were observed for MSS32 and MSS42 respectively, using TM43 as reference. Meanwhile, the inter-platform median relative difference for TM NDVI was minor, with the Landsat 4 TM having slightly higher NDVI than the Landsat 5 TM. Consequently, to make the NDVI consistency between different observations, univariate and bivariate models for transforming MSS NDVI to TM NDVI were proposed respectively. Findings based both on the cross-validation tests and the case applications with eight pairs of the Landsat 5 MSS and TM scenes, suggested that the bivariate models especially through ridge regression were more effective in NDVI transformation for intra-platform and inter-platform cases. Meanwhile, the univariate model through OLS regression could be the only solution for cases when problems of data quality are encountered (e.g., the problem in MSS NIR2), which provides moderate but effective contributions to alleviate the between-sensor difference.
In conclusion, the findings on NDVI transformation models from MSS to TM are valuable for reference, because of the collection of diverse Hyperion hyperspectral profiles, although further improvements are necessary for practical application.