Review Reports - Geostatistical Reconstruction of Atmospheric Refractivity Fields Using Universal Kriging

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

I recommend Major Revision. The manuscript addresses an operationally important geophysical field (near-surface atmospheric refractivity) and applies a sensible geostatistical idea (Universal Kriging with an elevation-dependent drift) to a relatively dense AWS network. However, the current version does not yet demonstrate methodological novelty beyond a well-known UK formulation, does not sufficiently justify key physical and statistical assumptions (notably the use of a single linear vertical drift across complex coastal–mountain terrain and rapidly evolving humidity regimes), and relies on a validation design that is too limited to support the broad claims of robustness and applicability.

The central methodological choice—Universal Kriging with a deterministic drift and a residual variogram—is standard geostatistics, and while the paper is well written, the manuscript currently reads as an implementation study rather than a methodological advance, so the authors should either reposition the contribution explicitly as an applied case study with clearly bounded scope or add a genuinely stronger technical contribution (e.g., a space–time kriging formulation, an anisotropic/nonstationary covariance model, or a demonstrable operational integration into propagation metrics) and critically compare against more than Ordinary Kriging to establish what is actually gained in practice beyond “UK is better when there is an elevation trend.”

A single linear altitude-dependent drift (estimated by OLS at each 10-min epoch) is an appealing first-order model, but in a coastal region with sharp humidity gradients, complex orography, and strong diurnal boundary-layer transitions, the vertical structure of refractivity is not guaranteed to be spatially homogeneous or linear with elevation, so the manuscript needs a much stronger justification (or relaxation) of this assumption, including diagnostics showing when and where the linear drift is adequate, how often it fails, and whether a more flexible mean structure (e.g., elevation + distance-to-coast + latitude/longitude terms, piecewise elevation trends, or regime-dependent drift) materially changes the mapped fields and the reported UK–OK skill differences.
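One way to produce the requested diagnostic is sketched below: fit the per-epoch linear drift by OLS and check whether the residuals average to zero within elevation bands. The function names, band edges, and synthetic values are illustrative assumptions and are not taken from the manuscript.

```python
from statistics import mean

def fit_linear_drift(elev_m, refractivity):
    """OLS fit of N = a + b * h for a single epoch; returns (a, b)."""
    h_bar, n_bar = mean(elev_m), mean(refractivity)
    sxx = sum((h - h_bar) ** 2 for h in elev_m)
    sxy = sum((h - h_bar) * (n - n_bar) for h, n in zip(elev_m, refractivity))
    b = sxy / sxx
    a = n_bar - b * h_bar
    return a, b

def mean_residual_by_band(elev_m, refractivity, band_edges_m):
    """Mean drift residual per elevation band [lo, hi)."""
    a, b = fit_linear_drift(elev_m, refractivity)
    bands = {}
    for h, n in zip(elev_m, refractivity):
        resid = n - (a + b * h)
        for lo, hi in zip(band_edges_m[:-1], band_edges_m[1:]):
            if lo <= h < hi:
                bands.setdefault((lo, hi), []).append(resid)
                break
    return {band: mean(vals) for band, vals in bands.items()}

# Synthetic epoch (assumed values): an exactly linear profile, so residuals vanish.
elev = [10.0, 200.0, 400.0, 800.0, 1200.0]
N = [330.0 - 0.04 * h for h in elev]
a, b = fit_linear_drift(elev, N)
band_means = mean_residual_by_band(elev, N, [0.0, 500.0, 1500.0])
```

With real data, band means that depart from zero persistently and with a consistent sign would suggest curvature or regime dependence that a single linear drift does not capture.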

The study uses high-frequency (10-min) AWS time series over weeks, yet the kriging is effectively presented as a sequence of independent spatial interpolations with epoch-wise drift estimation, which overlooks that refractivity (especially its wet term) has strong temporal autocorrelation and coherent mesoscale evolution, so the authors should either justify the “independent time-slice” approach as appropriate for the intended application or extend the framework to account for time explicitly (space–time variograms/kriging, temporal filtering of drift estimates, or regime-conditioned models), and in either case demonstrate that the inferred variogram parameters and prediction errors are stable in time rather than being artifacts of episodic events or unmodeled time dependence.

Residual variograms are fitted with standard isotropic models and the exponential model is selected largely by visual fit/weighted least squares, but for atmospheric fields over complex terrain, residual structure can be anisotropic (e.g., aligned with coastlines, valleys, prevailing flow) and potentially nonstationary, so the paper should include clear evidence that isotropy and stationarity are reasonable for the detrended residuals (directional variograms, anisotropy tests, stability across subdomains, and comparison of alternative covariance structures), and should quantify uncertainty in variogram parameter estimates because kriging performance and “map texture” can be highly sensitive to nugget/range choices, especially when residual variability is amplified under certain stratification regimes.
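The directional check mentioned above could look roughly like the following sketch, which computes the classical (Matheron) empirical semivariogram per lag bin and optionally restricts pairs to an azimuth sector. The function name, the angular tolerance, and the toy coordinates are assumptions for illustration only.

```python
import math

def empirical_variogram(coords, values, bin_edges, direction_deg=None, tol_deg=22.5):
    """Matheron estimator per lag bin: gamma = sum((z_i - z_j)^2) / (2 * n_pairs).

    If direction_deg is given, only pairs whose azimuth (clockwise from north,
    taken modulo 180) lies within tol_deg of that direction are used.
    Returns a list of (gamma or None, n_pairs), one entry per bin.
    """
    n_bins = len(bin_edges) - 1
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            dx = coords[j][0] - coords[i][0]
            dy = coords[j][1] - coords[i][1]
            dist = math.hypot(dx, dy)
            if direction_deg is not None:
                az = math.degrees(math.atan2(dx, dy)) % 180.0
                diff = abs(az - (direction_deg % 180.0))
                if min(diff, 180.0 - diff) > tol_deg:
                    continue
            for k in range(n_bins):
                if bin_edges[k] <= dist < bin_edges[k + 1]:
                    sums[k] += (values[i] - values[j]) ** 2
                    counts[k] += 1
                    break
    return [(s / (2 * c) if c else None, c) for s, c in zip(sums, counts)]

# Toy residuals on an east-west transect (assumed data).
pts = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
res = [0.0, 1.0, 2.0, 3.0]
edges = [0.5, 1.5, 2.5, 3.5]
omni = empirical_variogram(pts, res, edges)
east = empirical_variogram(pts, res, edges, direction_deg=90.0)
north = empirical_variogram(pts, res, edges, direction_deg=0.0)
```

Reporting the pair counts alongside each bin makes it easier to judge whether differences between directional variograms reflect real anisotropy or simply sparse bins.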

The current validation relies heavily on three independent control stations at different elevations. This is useful for illustrating altitude effects but is not sufficient to characterize domain-wide predictive performance, spatial bias patterns, or failure modes. The manuscript should add a more comprehensive validation strategy (e.g., repeated k-fold spatial cross-validation, block CV to avoid spatial leakage, performance stratified by elevation bands and coastal/inland sectors, and error maps rather than only station time series). It should also report uncertainty and distributional diagnostics (bias, quantiles, outlier frequency) rather than only RMSE/MAE, because the practical impact on propagation assessment is often driven by tails and gradients rather than mean errors.
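As a rough illustration of block cross-validation, the sketch below groups stations into square spatial blocks and holds out whole blocks together, so that test stations do not have immediate neighbours in the training set. The block size, fold count, function name, and coordinates are hypothetical choices, not values from the manuscript.

```python
import math
from collections import defaultdict

def spatial_block_folds(coords, block_size, n_folds):
    """Assign stations to square blocks, then deal whole blocks to folds.

    Returns a list of (train_indices, test_indices) pairs.
    """
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        key = (math.floor(x / block_size), math.floor(y / block_size))
        blocks[key].append(idx)
    fold_members = [[] for _ in range(n_folds)]
    for b, key in enumerate(sorted(blocks)):
        fold_members[b % n_folds].extend(blocks[key])
    all_idx = set(range(len(coords)))
    return [(sorted(all_idx - set(test)), sorted(test))
            for test in fold_members if test]

# Assumed projected station coordinates in km.
stations = [(1, 1), (2, 3), (12, 1), (14, 4), (3, 13), (15, 16)]
folds = spatial_block_folds(stations, block_size=10, n_folds=2)
```

Within each fold, the drift and the residual variogram would need to be re-estimated from the training stations only; otherwise information from the held-out blocks leaks into the prediction.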

The paper motivates the work with super-refraction and ducting risk, but the results stop mainly at refractivity maps and gradient discussion, so to make the study compelling for atmospheric/propagation applications the manuscript should quantify how the reconstructed fields translate into propagation-relevant diagnostics (e.g., refractivity gradient exceedance frequencies, ducting indices, spatial extent of regimes, or demonstration via a simple parabolic-equation/ray-tracing sensitivity test), and should show whether UK materially improves these downstream diagnostics compared with simpler approaches, because a modest RMSE reduction may or may not change regime classification or operational decisions.

For reproducibility, the manuscript should more clearly specify key implementation choices that can materially affect outputs, including the interpolation grid resolution used for map generation, how missing AWS records are handled at each epoch, whether variograms are fitted per-epoch or per-regime using pooled residuals, how lag bins and maximum ranges are chosen, and whether any quality control is applied to sensor outliers or inconsistent humidity/temperature readings that can strongly perturb refractivity.

The discussion of RMSE versus MAE is generally correct, but the paper should go further by explicitly characterizing the error distribution (e.g., skewness/outliers and when they occur), linking large residual episodes to meteorological situations (fronts, nocturnal inversions, coastal moisture surges), and explaining whether those large-error cases correspond to precisely the anomalous propagation regimes that the method aims to support, because that is where users care most about reliability.

Because the approach is demonstrated in a specific region with a particular AWS density and topographic complexity, the manuscript should more explicitly discuss transferability, including what station density is required for stable variogram estimation, how performance degrades under sparser networks, and how the method should be adapted in regions where elevation is not the dominant driver (e.g., strong coastal gradients at near-constant elevation), which will make the conclusions more credible and appropriately bounded.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

Geostatistical Reconstruction of Atmospheric Refractivity Fields Using Universal Kriging

The paper is generally well written, but the following comments should help revision.

L93: Terminology issue: The ‘dry’ term includes dry pressure (Pd), not the total pressure P (which is Pd + e).
L105: Remove the negative sign, because the word ‘decreases’ already conveys it.
Why assume a linear drift rather than a polynomial or exponential one? Pressure, for example, varies exponentially with height.
L125: Specify which station it is (e.g., give lat-lon coordinates or indicate it on a map).
Sect 3.1, L152: Explain briefly how lambda is related to gamma.
L181-182: What about the other f_k values? How are they determined?
Sect 3.4 or under methods: Explain briefly what software was used to do the Kriging interpolation.
Fig 5, Step 4. Suggestion: It would be better not to show the global map (because this is a regional study), and better still not to show any map here.
L217: ‘Improved’ with respect to what?
Fig 6: Consider using some lighter color (e.g., gray or silver) for the representative stations.
L273-274: Incorrect: colors are reversed
Fig 9: Rather than repeating pink dots, it would be better to plot the directly computed refractivity or residual values at these stations. This would facilitate comparison with the model estimates.
L317: It may be better to call these stations “test” (or another term) rather than “control.”
L325-27: Would the elevation pattern remain the same if other stations were selected as test locations?
L329: What is ‘small’?
L333-34: No correlation metric is provided. Correlation does not ‘confirm.’
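The pressure distinction raised in the L93 comment can be illustrated with the widely used ITU-R P.453 expressions: a three-term form written with the dry-air partial pressure Pd, and a two-term approximation written with the total pressure P = Pd + e. Which form the manuscript uses is not known here; the function names and sample values below are assumptions.

```python
def refractivity_three_term(p_dry_hpa, e_hpa, temp_k):
    """N = 77.6*Pd/T + 72*e/T + 3.75e5*e/T^2 (Pd, e in hPa; T in K)."""
    return (77.6 * p_dry_hpa / temp_k
            + 72.0 * e_hpa / temp_k
            + 3.75e5 * e_hpa / temp_k ** 2)

def refractivity_two_term(p_total_hpa, e_hpa, temp_k):
    """N = (77.6/T) * (P + 4810*e/T), with total pressure P = Pd + e."""
    return 77.6 / temp_k * (p_total_hpa + 4810.0 * e_hpa / temp_k)

# Assumed sample conditions: P = 1013.25 hPa, e = 10.2 hPa, T = 288.15 K.
p_total, e, temp = 1013.25, 10.2, 288.15
n3 = refractivity_three_term(p_total - e, e, temp)
n2 = refractivity_two_term(p_total, e, temp)
```

The two forms agree closely for typical surface conditions, but the first term is a purely "dry" contribution only when it is written with Pd; written with P, it already contains part of the water-vapour effect.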

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

This manuscript presents a geostatistical framework for the reconstruction of atmospheric refractivity fields using Universal Kriging (UK) with an altitude-dependent drift. The topic is quite relevant and the paper is generally well-structured, clearly written, and supported by a comprehensive dataset derived from a relatively dense network of automatic weather stations.

One of the main strengths of the work lies in the physical consistency of the proposed approach. The introduction of a deterministic drift linked to elevation is well justified, the methodology is also clearly described, and the workflow is easy to follow.

However, despite these strengths, the overall contribution of the manuscript appears somewhat incremental. The use of UK with an external drift based on elevation is well established in the geostatistical literature, and similar approaches have already been applied to atmospheric and (geo)physical variables. While the application to refractivity in the Galician region is valuable, the manuscript would benefit from a clearer positioning with respect to the state of the art, explicitly highlighting what methodological or practical advances are achieved beyond existing studies.

A more critical issue concerns the modelling of the deterministic component. The vertical refractivity gradient is assumed to be linear with respect to altitude, which is a reasonable first-order approximation, but may be overly simplistic under certain atmospheric conditions, particularly in the presence of strong stratification or ducting phenomena. This limitation should at least be discussed more explicitly, and, if possible, supported by a sensitivity analysis.

The variogram analysis, although adequate at a descriptive level, lacks a rigorous quantitative comparison between candidate models. The validation strategy also raises some concerns. The use of three independent control stations at different elevations provides useful insights into the vertical performance of the model, but it remains limited in terms of spatial representativeness. A more comprehensive validation framework, such as leave-one-out or k-fold cross-validation, would allow for a more robust assessment of the interpolation accuracy across the entire domain.

The manuscript is generally clear, although some sections of the introduction could be slightly condensed to improve readability. Minor inconsistencies in notation are present, particularly in the use of symbols for the vertical gradient, and should be harmonised. Some figures could also benefit from improved readability, for instance, through larger labels (Fig. 8) or more consistent colour scales (Fig. 9).

In conclusion, the manuscript presents a solid and well-executed application of UK to atmospheric refractivity mapping, with clear practical relevance and a sound methodological basis. Nevertheless, its level of novelty is moderate, and several aspects—particularly the validation strategy, the variogram modeling, and the treatment of the deterministic drift—would benefit from further clarification or extension.

For these reasons, I recommend publication after moderate revisions, addressing the points raised above to strengthen the overall contribution and robustness of the study.

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

Overall, the manuscript is well-structured and clearly written, providing a thorough description of the dataset and the proposed methodology. However, several limitations related to the validation strategy and the lack of statistical tests to rigorously assess the improvements of Universal Kriging over Ordinary Kriging reduce the strength of the conclusions. Moreover, the study appears to be primarily a case study rather than a presentation of a novel methodological approach, and it remains somewhat unclear what the specific contributions are with respect to the existing literature. Considering these points, the paper would benefit from major revisions to clarify the validation framework, provide stronger evidence for performance improvements, and explicitly highlight its originality.

While the application of Universal Kriging is appropriate, the manuscript does not clearly demonstrate a significant methodological advancement beyond existing geostatistical approaches. Similar frameworks combining altitude-dependent drift and kriging have been explored (maybe in a limited way), and the novelty of this implementation remains unclear. See (a) https://doi.org/10.3390/rs16132379; (b) https://doi.org/10.3390/environsciproc2023028011
Although several empirical variogram models are described in Section 3.3, the choice of an exponential variogram is not adequately justified in a quantitative way (e.g., a table or plot with weighted least squares error metrics per model). Moreover, the definition of the lag distance in the experimental variogram is not discussed, despite being a critical parameter in geostatistical modeling. The choice of lag size and number of lag classes directly affects the variogram estimation and the fitted model. The absence of this information raises concerns about the robustness and reproducibility of the results.
The dataset spans approximately 1.5 months, which provides high temporal resolution; however, it may be helpful to discuss the extent to which this period captures the full range of atmospheric variability. In particular, refractivity conditions can exhibit seasonal dependence, and a brief comment on the representativeness of the selected period would strengthen the study. The choice of the August/September 2018 time window is not explicitly justified. Providing some context on whether this period is climatologically typical or associated with specific atmospheric conditions would improve the clarity of the dataset selection.
The validation procedure based on independent control stations is interesting; however, it is not entirely clear whether this approach corresponds to a systematic cross-validation scheme or to a fixed validation on a limited number of stations. Although the three stations are excluded at each time step, the validation still relies on a very small and fixed subset of locations. Clarifying whether additional stations were tested, or whether alternative validation schemes (e.g., spatial cross-validation) were considered, would improve the robustness and generalizability of the results.

Using only three stations, even if located at different altitudes, may not fully capture the spatial variability of the study domain. While the selection is motivated by elevation differences, a larger and more spatially distributed validation set would provide a more comprehensive assessment of model performance. With 109 points available, the size of the test set could be increased.

While Figure 9 provides illustrative examples of interpolation results under different propagation regimes, the corresponding temporal references (e.g., specific dates or time steps) are not indicated. This makes it difficult to interpret the context of the selected cases and assess their representativeness. Providing the exact timestamps would improve clarity and reproducibility.
The reported improvement of UK over OK is attributed to the inclusion of the drift term, which is a reasonable explanation. However, the comparison is based solely on point-based error metrics (RMSE and MAE) at a limited number of locations. Including additional analyses could strengthen this conclusion. For instance, comparing the spatial distribution of kriging uncertainties (e.g., kriging variance) or assessing whether the differences between methods are statistically significant would provide a more comprehensive evaluation of the improvement.

While UK shows lower RMSE and MAE values, it would be helpful to assess whether these improvements are statistically significant. Applying appropriate statistical tests (e.g., paired t-tests on prediction errors) could provide stronger support for the claimed performance gains.
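A paired comparison of this kind could be set up roughly as follows, working on the per-epoch difference in absolute error between OK and UK at a given station. The function name and the toy error values are assumptions, and the p-value uses a normal approximation that is only reasonable for large samples.

```python
import math
from statistics import NormalDist, mean, stdev

def paired_t_statistic(abs_err_ok, abs_err_uk):
    """Paired t statistic on d_i = |e_OK,i| - |e_UK,i|.

    Returns (t, degrees_of_freedom, approximate two-sided p-value).
    The p-value is a large-sample normal approximation, not an exact t-test.
    """
    diffs = [a - b for a, b in zip(abs_err_ok, abs_err_uk)]
    n = len(diffs)
    se = stdev(diffs) / math.sqrt(n)
    t = mean(diffs) / se
    p_approx = 2.0 * (1.0 - NormalDist().cdf(abs(t)))
    return t, n - 1, p_approx

# Toy absolute errors in N-units (assumed values).
err_ok = [2.0, 4.0, 6.0, 8.0, 10.0]
err_uk = [1.0, 2.0, 3.0, 4.0, 5.0]
t_stat, dof, p_val = paired_t_statistic(err_ok, err_uk)
```

Because 10-min errors are serially correlated, the effective sample size is smaller than the number of epochs, so a test of this form would tend to overstate significance unless the series is thinned or a block-based resampling scheme is used instead.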

Author Response

Please see the attachment

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The author has thoroughly addressed my suggestions, and I accept the revisions.

Reviewer 3 Report

Comments and Suggestions for Authors

Dear Author,

Thank you very much for revising the document, which has been improved. There are still some minor things, related to the language (e.g., "Accesed on" instead of "Accessed on").

After these minor corrections, the paper can be accepted.
Congratulations!

Reviewer 4 Report

Comments and Suggestions for Authors

The authors have improved the manuscript according to my suggestions. Therefore, the paper is ready to be published.