Article
Peer-Review Record

Total Electron Content PCA-NN Prediction Model for South-European Middle Latitudes

Atmosphere 2023, 14(7), 1058; https://doi.org/10.3390/atmos14071058
by Anna Morozova 1,2,*, Teresa Barata 1,3, Tatiana Barlyaeva 1 and Ricardo Gafeira 1,2
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 5 June 2023 / Revised: 18 June 2023 / Accepted: 20 June 2023 / Published: 21 June 2023

Round 1

Reviewer 1 Report

Dear Editor,

In this paper, the authors develop a Total Electron Content (TEC) PCA-NN model for middle latitudes, following the steps: 1) development of a prototype using the data from a GNSS receiver installed in Lisbon (Portugal) and 2) test of the prototype using data from three different locations (continental Portugal, the Azores archipelago and Sao Miguel islands/Madeira archipelago). The model developed by the authors is found to perform better than a previously developed PCA-MRM model to forecast TEC.

The manuscript is well-written and the developed procedure is robust. The authors describe in detail every phase of model elaboration. I have just a few relatively minor comments and suggestions.

1) Lines 68-69: "The SCINDA data have several gaps that were filled with ROB TEC data.". I understand this. What would happen if you use only the available SCINDA data or only the ROB data (which I understand is complete)? I am a bit concerned with the mixing of data. Maybe you can comment on this aspect.

2) Lines 138-139: "Following our previous studies [13] and some other works [e.g., 22-23], we used lags of 1 and 2 days between the TEC and SWp series (SWp series lead)." I am a bit confused. So, do you use SWp predictors from the periods for which you predict TEC? Is this a common procedure? Please clarify.

3) Lines 269-271: "No negative daily mean TEC and EOF1 series were allowed: in case NNs forecast negative values of daily mean TEC or EOF1, they were multiplied by -1.". How is this procedure justified? Why, for example, do you not exclude such values or adopt a forecast value of 0? Please explain.

4) I understand that your model underestimates "TEC variations during days with geomagnetic disturbance". How do you think these forecasts (during geomagnetic disturbances) can be improved?

5) While the manuscript is in reasonable English, it would benefit from editing by a native English speaker.

 

The English is reasonable, but it would be useful to have the manuscript checked by a native English speaker.

Author Response

Reply to Reviewer 1

We are very thankful to the Reviewer for the useful comments. We address his/her comments below.

1) Lines 68-69: "The SCINDA data have several gaps that were filled with ROB TEC data.". I understand this. What would happen if you use only the available SCINDA data or only the ROB data (which I understand is complete)? I am a bit concerned with the mixing of data. Maybe you can comment on this aspect.

Mixing data from different sources is, in general, not recommended, as it affects the homogeneity of the studied series and may introduce artificial variability into the studied parameter. However, since this data series is used not to study TEC variability but as a test series to assess the applicability of the proposed approach to modelling TEC, we believe that the possible inhomogeneity is not essential. This is confirmed by the results presented in Sec. 4, where the prototype model is tested on TEC series obtained from the data of the Portuguese network of geodetic receivers (RENEP). The performance of the prototype does not change drastically when the originally used SCINDA+ROB series are replaced by RENEP data (except for Funchal, which, we believe, is due to its specific location).
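
The gap-filling step discussed in this reply can be illustrated with a minimal sketch. It assumes two already time-aligned TEC series in which missing samples are marked as `None`; the function name `fill_gaps` and the sample values are hypothetical, and any bias adjustment between the two sources is not shown. Recording the source of each sample makes it possible to check later whether the mixed samples behave differently.

```python
from typing import List, Optional, Tuple


def fill_gaps(
    primary: List[Optional[float]], secondary: List[Optional[float]]
) -> Tuple[List[Optional[float]], List[str]]:
    """Fill missing primary samples from a secondary series and record the source."""
    if len(primary) != len(secondary):
        raise ValueError("series must be aligned and of equal length")
    merged: List[Optional[float]] = []
    source: List[str] = []
    for p, s in zip(primary, secondary):
        if p is not None:
            merged.append(p)
            source.append("primary")
        elif s is not None:
            merged.append(s)
            source.append("secondary")
        else:
            # Neither series has a value: the gap remains.
            merged.append(None)
            source.append("missing")
    return merged, source


# Made-up values standing in for SCINDA (primary) and ROB (secondary) TEC.
scinda = [12.1, None, None, 15.4]
rob = [11.8, 13.0, None, 15.0]
merged, source = fill_gaps(scinda, rob)
```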

2) Lines 138-139: "Following our previous studies [13] and some other works [e.g., 22-23], we used lags of 1 and 2 days between the TEC and SWp series (SWp series lead)." I am a bit confused. So, do you use SWp predictors from the periods for which you predict TEC? Is this a common procedure? Please clarify.

The forecasting procedure is as follows:

We take a TEC series from day_1 to day_31 and submit it to the NN model together with the series of predictors, SWp and the TEC series itself, taken from day_0 to day_30 and from day_-1 to day_29. Thus, we introduce two time lags of 1 and 2 days between the SWp and TEC variations and, in addition, what we call “autoregression terms”, since TEC series have significant autocorrelation.

Yes, this approach is used quite often, as mentioned in the cited papers by other authors and in other publications cited in our previous paper (on the PCA-MRM model).
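
The lag structure described in this reply can be sketched as follows. This is an illustration only, not the authors' code: the function name `build_lagged_samples`, the list-based data layout and the toy values are assumptions. Each target TEC value at day t is paired with SWp and TEC values from days t-1 and t-2, so no predictor comes from the day being forecast.

```python
from typing import List, Tuple


def build_lagged_samples(
    tec: List[float], swp: List[float], lags: Tuple[int, ...] = (1, 2)
) -> Tuple[List[List[float]], List[float]]:
    """Pair each target TEC value with SWp and TEC values from earlier days."""
    if len(tec) != len(swp):
        raise ValueError("tec and swp must have the same length")
    max_lag = max(lags)
    features: List[List[float]] = []
    targets: List[float] = []
    for t in range(max_lag, len(tec)):
        row = [swp[t - lag] for lag in lags]   # space weather predictors lead
        row += [tec[t - lag] for lag in lags]  # autoregressive TEC terms
        features.append(row)
        targets.append(tec[t])
    return features, targets


# Toy daily series (values are made up for illustration).
tec = [10.0, 11.0, 12.0, 13.0, 14.0]
swp = [1.0, 2.0, 3.0, 4.0, 5.0]
X, y = build_lagged_samples(tec, swp)
```

With lags of 1 and 2 days, the first two days of the series serve only as predictors, so five days of data yield three training samples.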

3) Lines 269-271: "No negative daily mean TEC and EOF1 series were allowed: in case NNs forecast negative values of daily mean TEC or EOF1, they were multiplied by -1.". How is this procedure justified? Why, for example, do you not exclude such values or adopt a forecast value of 0? Please explain.

We first encountered the problem of “negative forecasted EOF1s” when developing the PCA-MRM model. In our view, they can arise from poorly constrained regression models: since each regression was developed automatically, without supervision, such negative forecasted EOF1s appear from time to time. We analyzed these values and how to treat them, and found that a simple sign change resolves the problem very well: the amplitude of the forecasted TEC fitted the observations very well. Our decision was based on the fact that if EOF1 <= 0, we cannot produce a forecast.

Actually, we do not know whether any negative EOF1s are produced by the NN models, because we adopted the “no negative EOF1” approach from the very beginning.
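
For comparison, the sign-change rule quoted from the manuscript and the two alternatives raised by the Reviewer (a forecast value of 0, or exclusion) can be written side by side. This is a minimal sketch with hypothetical function names and made-up forecast values; it says nothing about which option fits the observations best.

```python
from typing import List, Optional


def flip_sign(values: List[float]) -> List[float]:
    """Multiply negative forecasts by -1 (the rule quoted from the manuscript)."""
    return [-v if v < 0 else v for v in values]


def clip_to_zero(values: List[float]) -> List[float]:
    """Alternative: replace negative forecasts with 0."""
    return [max(v, 0.0) for v in values]


def exclude_negative(values: List[float]) -> List[Optional[float]]:
    """Alternative: mark negative forecasts as missing."""
    return [None if v < 0 else v for v in values]


# Made-up forecasted daily values, one of them negative.
forecast = [8.5, -0.7, 10.2]
flipped = flip_sign(forecast)
clipped = clip_to_zero(forecast)
excluded = exclude_negative(forecast)
```

Only the first option keeps a strictly positive value for every day, which matches the stated constraint that no forecast can be produced when EOF1 <= 0.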

4) I understand that your model underestimates "TEC variations during days with geomagnetic disturbance". How do you think these forecasts (during geomagnetic disturbances) can be improved?

From the literature we can deduce that models (of any type) developed using only data for geomagnetically disturbed time intervals perform better (not dramatically better, but still better) during geomagnetic disturbances than models trained only on quiet-day data or on mixed-day data. A probable next stage in the development of our model will be to train it both on all data and only on data from disturbed days, and to provide two forecasts depending on the current level of geomagnetic activity.
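
One way such a two-model scheme could be wired together is sketched below. Everything here is an assumption made for illustration: the reply does not name an activity index or a threshold, so `activity_index` and `threshold` are hypothetical, and the two "models" are placeholder functions rather than trained networks.

```python
from typing import Callable, Sequence

Model = Callable[[Sequence[float]], float]


def select_forecast(
    features: Sequence[float],
    activity_index: float,
    all_data_model: Model,
    disturbed_model: Model,
    threshold: float = 4.0,  # hypothetical cut-off, not taken from the paper
) -> float:
    """Use the disturbed-days model when activity reaches the threshold."""
    if activity_index >= threshold:
        return disturbed_model(features)
    return all_data_model(features)


def mean_model(features: Sequence[float]) -> float:
    """Placeholder for a model trained on all data."""
    return sum(features) / len(features)


def scaled_model(features: Sequence[float]) -> float:
    """Placeholder for a model trained on disturbed days only."""
    return 2.0 * sum(features) / len(features)


features = [10.0, 20.0]
quiet = select_forecast(features, 2.0, mean_model, scaled_model)
disturbed = select_forecast(features, 6.0, mean_model, scaled_model)
```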

5) While the manuscript is in reasonable English, it would benefit from editing by a native English speaker.

We will try to follow this advice and find a native speaker to review the paper.

Reviewer 2 Report

Comments to the manuscript atmosphere2432318

Total electron content PCA-NN model for middle latitudes

by Anna Morozova , Teresa Barata, Tatiana Barlyaeva, Ricardo Gafeira

 

The paper is a continuation and development of a series of works by these authors devoted to the creation of a regional forecast model and the study of TEC behavior during disturbances. It is shown that the combination of PCA with the NN approach leads to better results than the combination of PCA with the traditional regression method. A number of interesting results are obtained in the paper, which allows it to be recommended for publication after the comments are addressed.

 

Main comments

1. Regarding the title of the paper: still middle latitudes are a very wide area, maybe define it more specifically. In addition, the work is devoted to the prediction of TEC, but this is not reflected in the title, as well as in the keywords.

2. The presentation suffers from inconsistency: first in Tables 2-3 the results of comparison of two methods are given, then there is a section "PCA-NN model" and only then there is a reference to NN method, which plays a key role in this work. It is not always clear in the paper what refers to the results for the 1d or 1h time resolution series.

3. Since the PCA-MRM model turned out to be so weak compared to PCA-NN, maybe it should not have been described in such detail in this paper.

4. Line 188, section 3.2: When using statistics, the authors neglect such an estimate as the mean absolute percentage error (MAPE). Since the MAE and RMSE depend on TEC values, comparing these values, for example, at different latitudes is not entirely objective, especially when comparing the results with those of other authors.

5. Lines 244-245: If there were no PCA-NN results in Table 2, one could agree with this conclusion.

6. Lines 269-271: you are using the approach of the Yasyukevich group to determine TEC, but they have an approach to get non-negative TEC values, such as: Yury Yasyukevich, Anna Mylnikova and Artem Vesnin, GNSS-Based Non-Negative Absolute Ionosphere Total Electron Content, its Spatial Gradients, Time Derivatives and Differential Code Biases: Bounded-Variable Least-Squares and Taylor Series, Sensors 2020, 20, 5702; doi:10.3390/s20195702.

Why do you get negative values with NN?

 

Some inaccuracies

1. Line 17: replace altitudes with latitudes.

2. Reference [2] is missing in the text.

3. Lines 92-94, caption to figure 1: better replace Azores with Furnas and Madeira with Funchal.

 

 

 

Author Response

Reply to Reviewer 2

We are very grateful to the Reviewer for the useful comments and suggestions. We address his/her comments below.

Main comments

  1. Regarding the title of the paper: still middle latitudes are a very wide area, maybe define it more specifically. In addition, the work is devoted to the prediction of TEC, but this is not reflected in the title, as well as in the keywords.

The title and the keywords are modified accordingly.

  2. The presentation suffers from inconsistency: first in Tables 2-3 the results of comparison of two methods are given, then there is a section "PCA-NN model" and only then there is a reference to NN method, which plays a key role in this work. It is not always clear in the paper what refers to the results for the 1d or 1h time resolution series.

Since we compare the performance of the newly developed PCA-NN model with that of the previous one (PCA-MRM), we had to include a short description of the previous model and give its metrics, so that a reader does not need to search for another paper. Thus, we had to introduce Tabs. 2-3 in Section 3.3.

Almost all metrics are calculated for the 1h series, except those marked as “1d mean” (Tab. 2). To avoid confusion in the interpretation of Tab. 2, it was modified (the corresponding cells of the column “time resolution” are merged and the “1h” label is centered vertically).

  3. Since the PCA-MRM model turned out to be so weak compared to PCA-NN, maybe it should not have been described in such detail in this paper.

As we mentioned above, the PCA-NN performance is compared with the performance of the PCA-MRM model. Thus, we had to introduce this model and summarize its performance. We believe that it would be unfair to ask a reader to search for and read another paper to get a general idea of our previous model. We tried to minimize the PCA-MRM description (it actually takes up only 35 lines of the text).

  4. Line 188, section 3.2: When using statistics, the authors neglect such an estimate as the mean absolute percentage error (MAPE). Since the MAE and RMSE depend on TEC values, comparing these values, for example, at different latitudes is not entirely objective, especially when comparing the results with those of other authors.

We are grateful to the Reviewer for pointing out this useful metric to us. We added MAPE values to Tab. 6. They confirm the conclusions drawn from the other metrics (i.e., model performance with the SCINDA, Cascais and Furnas series is similar and much better than with the Funchal series).
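
The Reviewer's point about scale dependence can be shown with a short sketch using the standard definitions of MAE, RMSE and MAPE. The observation and forecast values are made up for illustration and are not taken from the paper. The same absolute error gives identical MAE and RMSE at low and high TEC levels, while MAPE differs by the ratio of the TEC levels.

```python
import math
from typing import Sequence


def mae(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute error, in the units of the data."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Root mean square error, in the units of the data."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mape(obs: Sequence[float], pred: Sequence[float]) -> float:
    """Mean absolute percentage error; observations must be non-zero."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(obs, pred)) / len(obs)


# Made-up series: the absolute error is 2 units in both cases.
low_obs, low_pred = [10.0, 10.0], [12.0, 8.0]
high_obs, high_pred = [40.0, 40.0], [42.0, 38.0]

low_scores = (mae(low_obs, low_pred), rmse(low_obs, low_pred), mape(low_obs, low_pred))
high_scores = (mae(high_obs, high_pred), rmse(high_obs, high_pred), mape(high_obs, high_pred))
```

Because MAPE divides by the observed value, it is undefined when an observation is zero, which is not a concern for TEC series that are strictly positive.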

  5. Lines 244-245: If there were no PCA-NN results in Table 2, one could agree with this conclusion.

Well, the performance of the PCA-MRM model is quite good compared with other models of that time, as we showed in our previous paper. Of course, between the PCA-MRM and PCA-NN models, the winner is the newer one.

  6. Lines 269-271: you are using the approach of the Yasyukevich group to determine TEC, but they have an approach to get non-negative TEC values, such as: Yury Yasyukevich, Anna Mylnikova and Artem Vesnin, GNSS-Based Non-Negative Absolute Ionosphere Total Electron Content, its Spatial Gradients, Time Derivatives and Differential Code Biases: Bounded-Variable Least-Squares and Taylor Series, Sensors 2020, 20, 5702; doi:10.3390/s20195702. Why do you get negative values with NN?

We first encountered the problem of “negative forecasted EOF1s” when developing the PCA-MRM model. They probably arise from poorly constrained regression models: since each regression was developed automatically, without supervision, such negative forecasted EOF1s appear from time to time. We analyzed these values and how to treat them, and found that a simple sign change resolves the problem very well: the amplitude of the forecasted TEC fitted the observations very well. Our decision was based on the fact that if EOF1 <= 0, we cannot produce a forecast.

Actually, we do not know whether any negative EOF1s are produced by the NN models, because we adopted the safe “no negative EOF1” approach from the very beginning.

Some inaccuracies

  1. Line 17: replace altitudes with latitudes.

Done

  2. Reference [2] is missing in the text.

Corrected

  3. Lines 92-94, caption to figure 1: better replace Azores with Furnas and Madeira with Funchal.

Captions are updated
