Elaboration of Simulated Hyperspectral Calibration Reference over Pseudo-Invariant Calibration Sites
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Accurate hyperspectral simulations are critical for the vicarious calibration of space-based sensors. This study presents a refined methodology to generate simulated radiometric calibration references (RCRs) over bright desert pseudo-invariant calibration sites (PICS). Key advancements include improved surface reflectance modelling using the RPV model and CISAR algorithm, enhanced atmospheric property characterization from multiple state-of-the-art datasets, and the use of the Eradiate Monte Carlo-based radiative transfer model.
Here are several suggestions and questions that the authors need to address and answer.
- The formula of the RPV model should be clearly written out.
- The surface reflectance data in the satellite observation direction is derived from the RPV model, which is obtained through atmospheric correction of satellite data. How to ensure the accuracy of the surface reflectance data?
- Which satellite data are used to invert the parameters of the RPV model? These satellite data are from multi-spectral sensors. How to eliminate the spectral and radiometric inconsistencies among sensors?
- The coverage times of the satellite data used to establish the RPV model do not overlap. How was this considered?
- Do the five best-fitting spectra from MARMIT conform to the spectral characteristics of the surface when the satellite passes over? If the spectra are inconsistent, how are they adjusted?
- There are mainly three improvements in the article, including improving surface reflectance modeling using the RPV model and CISAR algorithm, enhancing the characterization of atmospheric properties from multiple advanced datasets, and using the Eradiate radiative transfer model based on the Monte Carlo method. Compared with the original calibration method, how much do these refinements reduce the calibration uncertainty?
Author Response
We would like to thank the reviewer for the comments and suggestions to further improve our publication. Please find below our answers:
1. The formula of the RPV model should be clearly written out.
Thank you for this comment. This has been added to the manuscript.
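For reference, a standard statement of the RPV model as published in Rahman et al. (1993) is given below; the manuscript's exact notation (and any modified hot-spot parameter) may differ slightly.

```latex
% Standard RPV bidirectional reflectance factor (Rahman et al., 1993), with
% sun zenith angle \theta_0, view zenith angle \theta_v, relative azimuth
% \phi; \rho_0, k and \Theta are the free parameters.
\rho(\theta_0,\theta_v,\phi) \;=\; \rho_0\,
  \frac{\cos^{k-1}\theta_0\,\cos^{k-1}\theta_v}{(\cos\theta_0+\cos\theta_v)^{1-k}}\,
  F(g)\,\bigl[1 + R(G)\bigr],
\qquad
F(g) = \frac{1-\Theta^2}{\bigl[1 + 2\Theta\cos g + \Theta^2\bigr]^{3/2}},
\qquad
R(G) = \frac{1-\rho_0}{1+G},
% with \cos g = \cos\theta_0\cos\theta_v + \sin\theta_0\sin\theta_v\cos\phi
% and G = \sqrt{\tan^2\theta_0 + \tan^2\theta_v
%              - 2\tan\theta_0\tan\theta_v\cos\phi}.
```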
2. The surface reflectance data in the satellite observation direction is derived from the RPV model, which is obtained through atmospheric correction of satellite data. How to ensure the accuracy of the surface reflectance data?
The CISAR algorithm has been applied to several sensors; examples of validation against MODIS can be found in Luffarelli and Govaerts, 2019 (https://amt.copernicus.org/articles/12/791/2019/). Nevertheless, the RPV parameters are here retrieved from a series of satellites, using a reliable source of aerosol optical depth (AOD) prior information to facilitate the inversion and improve the reliability of the RPV parameters. Finally, the high accuracy of the surface reflectance obtained with the proposed method is confirmed by the low bias and standard deviation against observations, especially at longer wavelengths, where the effect of the atmosphere is reduced.
3. Which satellite data are used to invert the parameters of the RPV model? These satellite data are from multi-spectral sensors. How to eliminate the spectral and radiometric inconsistencies among sensors?
The following instruments were used for Libya-4, as detailed in Table 2 of the manuscript:
- MODIS (AQUA and TERRA)
- Sentinel-2 MSI (S2A)
- Sentinel-3 OLCI (S3A/B)
- SEVIRI onboard MSG-4
- POLDER onboard PARASOL
Each sensor is, however, processed independently, i.e. no prior harmonization or homogenization is performed. CISAR is designed to handle sensor-specific differences by explicitly incorporating radiometric and geometric uncertainties into the inversion process, and accounting for the specific sensor response function.
4. The coverage times of the satellite data used to establish the RPV model do not overlap. How was this considered?
This study relies on the assumption that the selected targets are Pseudo Invariant Calibration Sites with stable surface reflectance. Hence, the different temporal coverage of the satellite acquisition does not impact the retrieval of the RPV parameters.
5. Do the five best-fitting spectra from MARMIT conform to the spectral characteristics of the surface when the satellite passes over? If the spectra are inconsistent, how are they adjusted?
A test is implemented to ensure that the distance between the best-fitting spectra and the retrieval, at each available wavelength processed with the CISAR algorithm, is lower than 0.2. We have never observed spectral inconsistencies in the selected spectra, which would otherwise have suggested possible issues in the CISAR retrieval and/or an inappropriate choice of the reference dataset.
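A minimal sketch of this test is given below; the 0.2 threshold comes from the response above, while the array names and layout are illustrative assumptions.

```python
import numpy as np

# Sketch of the consistency test: keep only MARMIT candidate spectra whose
# distance to the CISAR-retrieved reflectance stays below the threshold at
# every wavelength processed by CISAR. Names and shapes are assumptions.
def select_consistent_spectra(candidate_spectra, cisar_reflectance, threshold=0.2):
    """candidate_spectra: (n_spectra, n_wavelengths) sampled at the CISAR
    wavelengths; cisar_reflectance: (n_wavelengths,) retrieved reflectance."""
    distances = np.abs(candidate_spectra - cisar_reflectance[None, :])
    mask = np.all(distances < threshold, axis=1)  # True where all wavelengths pass
    return candidate_spectra[mask]
```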
6. There are mainly three improvements in the article, including improving surface reflectance modeling using the RPV model and CISAR algorithm, enhancing the characterization of atmospheric properties from multiple advanced datasets, and using the Eradiate radiative transfer model based on the Monte Carlo method. Compared with the original calibration method, how much do these refinements reduce the calibration uncertainty?
Thank you for this comment. It would indeed be interesting to quantify separately the impact of each source of improvement. Unfortunately, this would require additional work, beyond the resources allocated for this study and, regardless, beyond the time allowed by the journal for this revision. However, the impact of the improved atmospheric profile characterisation is analysed in Leroy et al., 2024 (https://www.tandfonline.com/doi/full/10.1080/22797254.2024.2389798).
Reviewer 2 Report
Comments and Suggestions for Authors
The study aims to develop an improved method for generating simulated radiometric calibration references (RCRs) on bright desert pseudo-invariant calibration sites (PICS). After reviewing the manuscript, I believe the overall research content is feasible. However, I have the following suggestions:
1. What is the purpose of selecting two experimental areas?
2. Are there any criteria for selecting satellite datasets?
3. The central wavelengths or spectral ranges of the satellites should be annotated in Table 2, along with basic information such as spatial resolution. Similarly, Table 3 should also include spatial resolution and other relevant details.
4. In Figure 2, it is stated that SAZ = 15°, but the explanation mentions SZA as 10°.
5. The 12 blue points in Figure 3 are derived from which satellite? If two satellites have identical band settings, how is the final DHR value obtained?
6. In line 245, the role of "3. Inversion" is not clear. Since the curves in Figure 3 are already interpolated, why is interpolation performed again here?
7. Both modeling and validation data are derived from Table 2, which may compromise the reliability of the results. It is recommended to use other datasets for validation to enhance the credibility of the study.
8. The manuscript repeatedly uses summary titles such as "3.3.2 Summary" and "3.3.3 Summary." These titles should be more specific to accurately reflect the content of each section.
9. In multispectral validation, the error in Gobabeb is obviously larger than that in Libya. Can the algorithm be improved in the future to enhance precision and further demonstrate the superiority of the proposed method?
10. Line 551: "this study demonstrates a substantial improvement in simulation stability, with the standard deviation of the accumulated bias reduced from 1.7% to 0.7%..." How are these numerical values calculated?
Author Response
The authors would like to thank the reviewer for the constructive comments to improve the manuscript. Please find below our reply:
The study aims to develop an improved method for generating simulated radiometric calibration references (RCRs) on bright desert pseudo-invariant calibration sites (PICS). After reviewing the manuscript, I believe the overall research content is feasible. However, I have the following suggestions:
1. What is the purpose of selecting two experimental areas?
Libya-4 was chosen as a reference for comparing our results with previous efforts, given its extensive use for vicarious calibration purposes, while Gobabeb is a new site with increasing relevance for vicarious calibration applications; we believe it is relevant to compare our method (i.e. purely based on RTM simulations) with the study from De Vis et al., 2024.
The text has been updated as follows:
“The targets selected for this study are Libya-4 and Gobabeb. Libya-4 was chosen due to its extensive historical use as a bright desert PICS, making it an essential reference point for validating and comparing results with previous calibration efforts. Gobabeb, part of the HYPERNETS project [18] and of the CEOS initiative Radiometric Calibration Network (RadCalNet) [19], is a more recent site with the distinct advantage of providing in-situ measurements.”
2. Are there any criteria for selecting satellite datasets?
The following text has been added: “selected based on the availability of observations over the selected PICS and on the quality of their radiometric characterisation”.
3. The central wavelengths or spectral ranges of the satellites should be annotated in Table 2, along with basic information such as spatial resolution. Similarly, Table 3 should also include spatial resolution and other relevant details.
Adding additional columns to Tables 2 and 3 would strongly affect the readability of the tables. However, we have added the following text:
“The selected instruments acquire observations in the 350–2500 nm spectral range with a spatial resolution ranging from a few metres to about 7 km.”
4. In Figure 2, it is stated that SAZ = 15°, but the explanation mentions SZA as 10°.
Thanks for noticing this mistake; it has been corrected in the text. We confirm that the SZA is set at 15°.
5. The 12 blue points in Figure 3 are derived from which satellite? If two satellites have identical band settings, how is the final DHR value obtained?
The following text has been added:
“The blue dots represent the DHR derived from CISAR and the associated uncertainties; in spectral regions where multiple CISAR retrievals are available, the DHR is computed as the mean value and the uncertainty is calculated as in Eq. 1.”
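As a purely illustrative sketch (Eq. 1 of the manuscript is not reproduced here), the combination could look as follows, assuming independent retrievals whose uncertainties propagate as the standard uncertainty of the mean:

```python
import numpy as np

# Illustrative only: combine overlapping CISAR DHR retrievals into one value.
# The assumption of independent retrievals combined as the standard
# uncertainty of the mean stands in for the manuscript's Eq. 1.
def combine_dhr(dhr_values, dhr_uncertainties):
    dhr = np.asarray(dhr_values, dtype=float)
    u = np.asarray(dhr_uncertainties, dtype=float)
    mean_dhr = dhr.mean()
    mean_u = np.sqrt(np.sum(u**2)) / len(u)  # u of the mean, independent inputs
    return mean_dhr, mean_u
```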
6. In line 245, the role of "3. Inversion" is not clear. Since the curves in Figure 3 are already interpolated, why is interpolation performed again here?
No interpolation is performed again at this stage. Here, the interpolated BRF obtained in step 2 is inverted to retrieve the RPV parameters at 1 nm resolution.
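A minimal sketch of such a per-wavelength inversion is shown below; `rpv_brf` stands for a hypothetical implementation of the RPV equation given earlier, and the starting values are arbitrary assumptions.

```python
import numpy as np
from scipy.optimize import least_squares

# Sketch: at each 1 nm wavelength, fit the RPV parameters (rho_0, k, Theta)
# to the interpolated BRF over the available sun/view geometries.
# `rpv_brf(geometries, rho_0, k, theta)` is a hypothetical RPV forward model.
def invert_rpv(brf_interp, geometries, rpv_brf, x0=(0.3, 0.8, -0.1)):
    residuals = lambda p: rpv_brf(geometries, *p) - brf_interp
    return least_squares(residuals, x0).x  # retrieved (rho_0, k, Theta)
```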
7. Both modeling and validation data are derived from Table 2, which may compromise the reliability of the results. It is recommended to use other datasets for validation to enhance the credibility of the study.
Thank you for your comment. We agree that showing results against other multispectral instruments could improve the reliability of the results, but unfortunately this has not been possible in the framework of this study due to resource constraints. However, as the method is based on an ensemble of instruments, it is not directly linked to one particular sensor. Also, we are presenting results against hyperspectral data, which have not been used for the modelling.
8. The manuscript repeatedly uses summary titles such as "3.3.2 Summary" and "3.3.3 Summary." These titles should be more specific to accurately reflect the content of each section.
Section 3.3.2 has been renamed to “Performance assessment of the simulated RCR with multispectral sensors” and section 3.3.3 to “Evaluation of the simulated RCR with hyperspectral observations”.
9. In multispectral validation, the error in Gobabeb is obviously larger than that in Libya. Can the algorithm be improved in the future to enhance precision and further demonstrate the superiority of the proposed method?
The limitations over Gobabeb compared to Libya-4 are discussed in lines 451 to 458. Furthermore, the limited spatial extent compared to Libya-4 makes the site more susceptible to spatial heterogeneity and to contamination from surrounding terrain features (this sentence has been added in the text). The simulations could improve, for instance, by using the in-situ aerosol measurements; however, we wanted to keep exactly the same setup as over Libya-4 for the purpose of this study. Additional research on possible improvements to the method will be performed provided appropriate resources.
10. Line 551: "this study demonstrates a substantial improvement in simulation stability, with the standard deviation of the accumulated bias reduced from 1.7% to 0.7%..." How are these numerical values calculated?
The standard deviation is calculated over all available data points. The value of 1.7% is reported in Misk et al. 2022 (Table 6.1). This has been clarified in the text.
Reviewer 3 Report
Comments and Suggestions for Authors
This paper is a nice evaluation of a new vicarious validation workflow for satellite imagery over two PICS sites. Its foundation is mainly on previous work done by the authors, where they are incrementally going a step further in their developments; some broader foundation would be desirable. Valid results are given for both multispectral and hyperspectral sensors. The improvements over currently available alternative methods are not clearly given; a direct comparison to other workflows based on the same dataset would have been interesting to see, as the authors state a significant improvement in accuracy. For the hyperspectral case, a direct and traceable comparison is missing and it remains somewhat unclear how reliable the given results are. However, the given statistical analyses and plausibility tests point towards a reliable method, worth being published.
Some detail comments:
- The abstract already contains lots of abbreviations, e.g. RPV, CISAR, which should be avoided or explained.
- line 59: influences of aerosols are not only very significant for low reflectance but also for bright targets, as in some bright PICS.
- page 3: the paper highly relies on the Eradiate model - a short description and qualitative comparison to 'standard' models such as 6SV, MODTRAN6, or Libradtran should be given.
- page 4: what's a 1D surface reflectance model? - I would understand a single spectrum... (can't find it in ref [7]).
- page 5: references 23 and 24 are the same...
- page 5: the CISAR algorithm iterates on aerosol amounts but does not optimize the water vapor - this has quite some impact on the accuracy of hyperspectral validation. This should be mentioned.
- Figure 4: the uncertainty given in this figure is not realistic for the fully absorbing water vapor bands (where it typically goes to really high numbers) some interpolation may have happened - these ranges should be omitted.
- page 11: it's not completely clear which aerosol parameters are optimized with the CISAR algorithm and which ones are kept constant based on AERONET. Please clarify.
- page 12: The improvement from 1.7 to 0.7% of accumulated bias is shown as a main achievement of the method. I can't find the foundation or proof for these numbers in the given data in figure 8 or elsewhere in the paper.
- page 15: '.. findings are particularly interesting ' - what's interesting here and why?
- page 17/18: figures 12 and 14 are very important and could be shown in more detail (while 11 and 13 don't show that much).
- There's a high correlation between EMIT and EnMAP bias values - one would not expect this. Could it be that there's a systematic problem with the RTM simulations or the surface reflectance assumptions?
- Please mention the number of averaged pixels - 'only 6 observations' may be many more if each pixel is considered a measurement.
- page 20: 'higher spectral resolution RTMs' are mentioned - what resolution would be recommended then and what was the resolution here?
- The conclusions talk about a significant step forward and again use the 0.7 percent bias variation as a main argument. However, no direct comparison to other methods is given. Thus, one should be more cautious in rating the results.
Author Response
We would like to thank the reviewer for the insights and additional comments. We believe that the manuscript is now improved following the reviewer’s suggestions. Please find below our reply:
This paper is a nice evaluation of a new vicarious validation workflow for satellite imagery over two PICS sites. Its foundation is mainly on previous work done by the authors, where they are incrementally going a step further in their developments; some broader foundation would be desirable. Valid results are given for both multispectral and hyperspectral sensors. The improvements over currently available alternative methods are not clearly given; a direct comparison to other workflows based on the same dataset would have been interesting to see, as the authors state a significant improvement in accuracy. For the hyperspectral case, a direct and traceable comparison is missing and it remains somewhat unclear how reliable the given results are. However, the given statistical analyses and plausibility tests point towards a reliable method, worth being published.
We thank the reviewer for the positive assessment and constructive suggestions. While this work builds upon previous developments, it introduces significant refinements in surface and atmospheric characterization and uses the Eradiate RTM for the first time at this level of detail. Regarding hyperspectral validation, we acknowledge the current limitation in direct SI-traceable comparisons and hope to compare our results with TRUTHS observations once they become available.
- The abstract already contains lots of a abbreviations, e.g RPV, CISAR which should be avoided or explained.
We have removed and/or explained the abbreviations in the abstract.
- line 59: influences of aerosols are not only very signficant for low reflectance but also for bright targets as in some bright PICS.
There might be a misunderstanding here. The manuscript reads: “Over bright deserts, surface reflection typically dominates radiative processes. However, aerosol scattering can also be significant, especially in the visible spectral region, where surface reflectance is relatively low”. We do agree that aerosols can have a significant impact also over bright deserts, but mostly in the visible spectral region.
- page 3: the paper highly relies on the Eradiate model - a short description and qualtitive comparison to 'standard' models such as 6SV, MODTRAN6, or Libradtran should be given.
The following text has been added: “Eradiate allows, among other features, a spherical-shell geometry, overcoming the plane-parallel atmosphere assumption. Future versions of the model will also include polarization, which is expected to improve the simulations, especially at shorter wavelengths.”
More detailed comparison against standard models will be presented in future publications by Dr. Vincent Leroy.
- page 4: what's a 1D surface reflectance model? - I would understand a single spectrum... (can't find it in ref [7]).
The following has been added to the manuscript “The 1D assumption on the horizontal homogeneity and flatness of the surface neglects any lateral variability in surface properties (e.g., dune shapes, stone patches), and assumes that the reflectance depends solely on geometric configuration, not spatial heterogeneity.”
- page 5: references 23 and 24 are the same...
Thank you, this has been fixed.
- page 5: the CISAR algorithm iterates on aerosol amounts but does not optimize the water vapor - this has quite some impact on the accuracy of hyperspectral validation. This should be mentioned.
The following has been added in Section 2.2.3: “It should be noted that while CISAR performs an iterative inversion of aerosol optical properties, it does not currently optimize the total column water vapour (TCWV) during the retrieval. Instead, TCWV is a model parameter, taken from CAMS reanalysis. This may introduce additional uncertainty, especially in the retrieval of surface reflectance in water vapour absorption bands.”
- Figure 4: the uncertainty given in this figure is not realistic for the fully absorbing water vapor bands (where it typically goes to really high numbers) some interpolation may have happened - these ranges should be omitted.
This uncertainty is calculated from the inversion of the BRF with the tool provided by JRC. No further interpolation is performed. Regardless, we removed this figure, following the suggestion of the reviewer.
- page 11: it's not becoming completely clear, what aerosol parameters are optimized with the CISAR algorithms and which ones are kept constant based on aeronet. Pls clarify.
CISAR is here only used for the retrieval of the RPV parameters; it is not used to obtain the aerosol model, which is derived as described in Section 2.3.
- page 12: The improvement from 1.7 to 0.7% of accumulated bias is shown as a main achievement of the method. I can't find the foundation or proof for these numbers in the given data in figure 8 or elsewhere in the paper.
The standard deviation is calculated over all available data points. The value of 1.7% is reported in Misk et al. 2022 (Table 6.1). This has been clarified in the text.
- page 15: '.. findings are particularly interesting ' - what's interesting here and why?
The text has been updated as follows: “These findings are noteworthy because they show that the simulation-based approach proposed in this study yields bias patterns comparable to those obtained in De Vis et al., 2024, using in situ surface reflectance from the LANDHYPERNET station at Gobabeb, propagated to TOA using the 6SV radiative transfer model.”
- page 17/18: figures 12 and 14 are very important and could be shown in more detail (while 11 and 13 don't show that much).
The following text has been added in section 3.3.1:
“While the overall trend confirms that biases remain low (typically within ±3%) in high-transmittance regions, the figure also reveals that the results are not necessarily consistent across instruments. However, some correlation can be observed between EMIT and EnMAP in the SWIR range, which is affected by numerous narrow atmospheric absorption features. This underscores the importance of accurate gas absorption modelling at high spectral resolution and supports the need for future refinement in atmospheric characterization, particularly for hyperspectral calibration in the shortwave infrared.”
And in Section 3.3.2:
“While the general trend is consistent with Libya-4, with lower bias in high-transmittance regions and increased deviations near strong absorption features, the overall agreement appears less stable and more variable compared to Fig. \ref{fig:hyperspectral_bias}. This reflects the additional challenges of validating over a smaller and more heterogeneous target, where spatial mismatches and background contamination can influence sensor performance.”
- There's a high correlation between EMIT and EnMAP bias values - one would not expect this. Could it be that there's a systematic problem with the RTM simulations or the surface reflectance assumptions?
The biases associated with EMIT and EnMAP show a high correlation only in the SWIR. This region is strongly affected by atmospheric absorption, which could indeed suggest a systematic problem. In other parts of the spectrum, the results among instruments (including PRISMA) and among targets are not consistent enough to draw conclusions on possible systematic problems.
- Please mention the number of averaged pixels - 'only 6 observations' may be many more if each pixel is considered a measurement.
Thank you for your comment. We would like to clarify that although each satellite scene may contain many individual pixels, all observations are spatially aggregated over the extent of the selected target areas (20 km for Libya-4 and 2 km for Gobabeb). As such, only one aggregated observation per overpass is considered as a single measurement in the analysis. Therefore, when we refer to “only 6 observations,” we specifically mean 6 aggregated, target-level overpasses, each representing a spatially averaged reflectance over the full calibration site.
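A minimal sketch of this aggregation is given below, under the assumption of a simple pixel mask over the site extent; variable names are illustrative.

```python
import numpy as np

# Sketch: all valid pixels inside the target extent (20 km for Libya-4,
# 2 km for Gobabeb) collapse into one target-level observation per overpass.
def aggregate_overpass(toa_reflectance, target_mask):
    """toa_reflectance: 2D scene array; target_mask: boolean mask of the site.
    Returns a single spatially averaged reflectance for this overpass."""
    return float(np.nanmean(toa_reflectance[target_mask]))
```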
- page 20: 'higher spectral resolution RTMs' are mentioned - what resolution would be recommended then and what was the resolution here?
These results are obtained at 1 nm resolution, as explained in section 3.1 (and now recalled in the discussion). We would like to refrain from making recommendations before having the possibility of applying our method at higher spectral resolutions.
- The conclusions talk about a significant step forward and again use the 0.7 percent bias variation as a main argument. However, no direct comparison to other methods is given. Thus, one should be more cautious in rating the results.
Thank you for this comment. We agree that comparative assessments are essential for substantiating methodological claims. This study builds on the heritage of past work, which has been used, among others, to refine S3A/SLSTR calibration. We thus considered it relevant to compare our new method with previous work; the reference to the 0.7% standard deviation is intended as an internal consistency indicator. The purpose of including this value is to highlight the improved stability achieved through refined surface and atmospheric characterization and the use of the Eradiate RTM, rather than to make a definitive performance claim over alternative methods. An improved explanation of how this value is calculated has been added to the manuscript thanks to the reviewers’ comments.
Reviewer 4 Report
Comments and Suggestions for Authors
See attachment
Comments for author File: Comments.pdf
Author Response
We would like to thank the reviewer for the insights and additional comments. We believe that the manuscript is now improved following the reviewer’s suggestions. Please find below our reply:
The authors suggest using RCRs over PICS as generally accepted photometric standards for VIS/NIR. They describe some refinements to the previously employed methodology. By and large they make a convincing case, and the manuscript is a useful summary of a mature technique for (inter-)calibration and monitoring of sensors.
In the following I list two major (in bold) and some minor suggestions for improvements.
1. Line 9: Given the fact that not everybody reading the abstract is necessarily an expert on reflectance modelling, I would explain the abbreviations “RPV” and “CISAR”.
We have removed and/or explained the abbreviations in the abstract (see the reply to reviewer#3 comment 1)
2. Lines 49 - 52: It is not clear what limitations the authors refer to. In my opinion one cannot claim that intercalibration is not applicable to early Earth observation. There are methods different from PICS available, for example based on random observations of the Moon.
The text refers to the intercalibration method where one reference radiometer is used for the calibration of another radiometer. These lines do not discuss the different types of vicarious calibration methods, like using the moon.
3. Line 149: “typically > 100 µm” seems like a strange characterization of the size of stones and dunes.
100 µm is roughly the upper limit where Mie theory can be used to characterize the scattering of radiation by particles. Above this limit, geometrical optics should be used.
4. Line 152: Maybe describe the reflectance model with one sentence instead of one word (1D).
The sentence has been modified and reads now:
In practice, PICS targets are assumed to be perfectly uniform and flat, allowing their surface properties to be represented by a 1D surface reflectance model like the Rahman Pinty Verstraete (RPV) model used by [7].
5. Line 164: Explain what a “hot spot due to medium porosity” is.
The medium porosity indicates the amount of void filled by air where shadowing effects can take place. The hot spot peak, which appears in the backward scattering direction due to the shadow-hiding opposition effect, is therefore shaped by these shadowing effects within the microstructure of the surface.
6. Figure 1: Explain why only the longitudinal profile is shown. (Because of the orientation of the dunes?)
Yes, because of the orientation of the dunes.
7. Line 189: Is the symbol used here different from Θ in line 162?
The typo has been corrected.
8. Equation (2): This equation does not look right. The derivative ∂f/∂p_i in the first term after the equal sign should be squared. The expression "p_j (p_j ≠ p_i)" below the last sum sign is incomplete. The product of variances σ²(p_i) σ²(p_j) should be the covariance σ(p_i, p_j). For a justification of these corrections, see "Taylor J., 1997, An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements".
Thank you for pointing out this mistake. We confirm that the formula is correctly implemented in our code. The manuscript has been corrected.
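For completeness, the corrected first-order propagation formula implied by the reviewer's remarks (squared sensitivities in the first sum, covariances in the cross terms), following Taylor (1997), presumably takes the form:

```latex
% First-order uncertainty propagation with correlated parameters (Taylor, 1997);
% \sigma_{p_i p_j} denotes the covariance between parameters p_i and p_j.
\sigma_f^2 \;=\; \sum_i \left(\frac{\partial f}{\partial p_i}\right)^{\!2} \sigma_{p_i}^2
  \;+\; \sum_i \sum_{j \neq i} \frac{\partial f}{\partial p_i}\,
        \frac{\partial f}{\partial p_j}\,\sigma_{p_i p_j}
```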
9. Line 220: The documentation of comet_maths is not accessible. The link at https://www.comet-toolkit.org/tools/comet_maths/ does not work. Give another source of information about this tool.
The URL https://www.comet-toolkit.org/tools/comet_maths/ is the correct link and works fine.
10. Figure 5: Obviously the distribution is not symmetric. Therefore, it should not be fitted with a normal distribution but with another, e.g. a Poisson distribution. One should not simply exclude data that do not fit the model.
Thank you for this observation. We agree that the distribution of the coarse mode radius is asymmetric and that the normal distribution may not be the most representative model. While the normal distribution was initially chosen for its simplicity and interpretability, we acknowledge that alternative distributions would better capture the positively skewed nature of the data. We have tested the use of both lognormal and gamma distributions and confirmed their better fit to the data. However, we emphasize that the purpose of this analysis is not to model the full physical distribution of particle sizes, but rather to derive a robust estimate of the standard deviation to be used as uncertainty input for the radiative transfer model. In this context, our aim is to remove extreme tails that may arise from measurement errors or unrepresentative retrieval conditions. We have revised the caption of Figure 5 and the relevant text in Section 2.3.2.
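A sketch of such a comparison is given below; the data source and the set of candidate models are illustrative assumptions, and the response above only reports that the lognormal and gamma models fit better than the normal one.

```python
import numpy as np
from scipy import stats

# Sketch: fit candidate models to the coarse-mode radius samples and rank
# them by log-likelihood (higher = better fit). Illustrative only.
def compare_fits(radii):
    results = {}
    for name, dist in [("normal", stats.norm),
                       ("lognormal", stats.lognorm),
                       ("gamma", stats.gamma)]:
        params = dist.fit(radii)                        # maximum-likelihood fit
        results[name] = np.sum(dist.logpdf(radii, *params))
    return results
```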
11. Line 300 and 301: Maybe the tails of the histograms are not outliers but a valid indication of high uncertainty.
Indeed, the presence of extended tails may reflect real atmospheric variability and should not automatically be considered as noise or error. However, in this study we are focused on the derivation of representative input parameters and associated uncertainties to be used in radiative transfer simulations, not on the full characterization of aerosol variability. To avoid overestimating the uncertainty due to occasional extreme values, we have revised our approach. To ensure that we do not exclude more than 5% of the data, we now exclude the top and bottom 2.5% of the data. This exclusion does not imply that the tails are invalid, but rather aims to reduce their disproportionate influence on the standard deviation used as uncertainty input. This methodological choice has been clarified in the revised manuscript.
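A minimal sketch of the revised trimming, with the 2.5% tail fraction taken from the response above:

```python
import numpy as np

# Sketch: drop the top and bottom 2.5% of the samples (at most 5% in total)
# before computing the standard deviation used as RTM uncertainty input.
def robust_std(samples, tail_fraction=0.025):
    lo, hi = np.percentile(samples, [100 * tail_fraction,
                                     100 * (1 - tail_fraction)])
    trimmed = samples[(samples >= lo) & (samples <= hi)]
    return float(np.std(trimmed))
```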
12. Figures 5 and 6: It is not clear, which one of the two figures reflects the size distribution correctly.
Figure 5 shows the size distribution of the coarse mode as retrieved from the selected AERONET stations. Figure 6 shows the cumulative size distribution (fine+coarse) as estimated with the proposed method.
13. Line 361: It is not quite clear how the standard deviation of the bias was determined. Is its weight based on the size or the uncertainty of the bias?
The correction is weighted by the inverse of the standard deviation.
14. Line 415: Replace “3” with “3.1”.
Done
15. Figure 9: 3% is not the estimated accuracy of the radiative transfer model but the limit for applying calibration corrections.
This limit is based on the uncertainty of the radiative transfer modelling.
16. Figure 15 and 16: Explain what the rectangles mean.
The text and legends have been clarified and read now:
“To better understand the role of atmospheric transmittance, Figures 14 and 15 correlate observed biases with molecular transmittance. These box-and-whisker plots show statistical aggregates for transmittance bins; a statistically significant number of data points occurs for transmittance values larger than 0.65. The shaded boxes show Q1, Q2 and Q3; whiskers extend to the last data point within 1.5 times the interquartile range and therefore exclude outliers, which are drawn with small circle markers. For transmittance values smaller than 0.65, where bins have too few data points for a box-and-whisker plot to make sense, all data points are displayed with circle markers.”
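As a sketch of this plotting logic (the 0.65 cut comes from the text above; the bin width and minimum bin population are illustrative assumptions):

```python
import numpy as np
import matplotlib.pyplot as plt

# Sketch: bin biases by molecular transmittance; draw box-and-whisker plots
# for well-populated bins above 0.65 and raw circle markers elsewhere.
def plot_bias_vs_transmittance(transmittance, bias, bin_width=0.05, min_points=10):
    transmittance, bias = np.asarray(transmittance), np.asarray(bias)
    edges = np.arange(0.0, 1.0 + bin_width, bin_width)
    boxes, positions = [], []
    fig, ax = plt.subplots()
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = bias[(transmittance >= lo) & (transmittance < hi)]
        centre = 0.5 * (lo + hi)
        if centre >= 0.65 and len(sel) >= min_points:
            boxes.append(sel)
            positions.append(centre)
        elif len(sel):
            ax.plot(np.full(len(sel), centre), sel, "o", ms=3)
    if boxes:  # default whiskers: 1.5x IQR, outliers drawn as circle markers
        ax.boxplot(boxes, positions=positions, widths=0.8 * bin_width)
    ax.set_xlabel("Molecular transmittance")
    ax.set_ylabel("Bias [%]")
    return fig
```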
17. Reference [17]: What journal or book is this article in?
It is a git repository. The URL has been added.
18. Reference [20]: The authors are ESA, Airbus, and Copernicus?
The authors have been corrected. It is ESA and EU.
19. Reference [37]: Who is Y.G.?
The list of authors has been updated.
20. Reference [45]: Web page was not found.
The URL has been corrected in the reference.
21. References [46] and [47]: web address is written twice.
Reference 46 has been removed because it was the same as reference 37. Reference 47 has been corrected.