Time–Frequency Domain Seismic Signal Denoising Based on Generative Adversarial Networks
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Please see attached comments.
Comments for author File: Comments.pdf
Author Response
- The mask signal is not adequately explained.
Response1:
Thank you for your guidance. I have revised it according to your comments. Please see the Yellow highlighted part on line 93, page 2 for details.
- The generator structure is not adequately explained.
Response2:
Thank you for your guidance. I have revised it according to your comments. Please see the Yellow highlighted part on line 130, page 4 for details.
- The RSBU unit is not adequately explained.
Response3:
Thank you for your guidance. I have revised it according to your comments. Please see the Yellow highlighted part on line 179, page 5 for details.
- In Fig. 2 what is RSN.
Response4:
Thank you for your guidance. RSN stands for Residual Shrinkage Network. Based on your suggestion, we think it would be more appropriate to abbreviate it as RSBU (Residual Shrinkage Building Unit) to maintain consistency with the name in Figure 3. We have changed “RSN” to “RSBU” in Figure 2.
- Line 194: what is the Pixshuffle layer.
Response5:
Thank you for your guidance. “PixShuffle” was introduced in the paper “Real-Time Single Image and Video Super-Resolution Using an Efficient Sub-Pixel Convolutional Neural Network”, which proposes a method to obtain high-resolution feature maps from low-resolution ones by applying a convolution and then rearranging the elements of multiple channels into a larger spatial grid. We have added a citation to this paper. Please see the Yellow highlighted part on line 225, page 6 for details.
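As a rough illustration of the channel-to-space rearrangement described in this response, the following minimal sketch reimplements the operation in plain Python. The function name `pixel_shuffle`, the nested-list layout (channels, height, width), and the upscale factor `r` are assumptions made for illustration. It is not the manuscript's layer, and it omits the convolution that normally precedes the rearrangement.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into (C, H*r, W*r).

    Uses the usual sub-pixel convention: out[c][h*r + i][w*r + j]
    is taken from input channel c*r*r + i*r + j at position (h, w).
    """
    in_channels = len(x)
    if in_channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    height, width = len(x[0]), len(x[0][0])
    out_channels = in_channels // (r * r)
    # Allocate the enlarged output grid.
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(out_channels)]
    for c in range(out_channels):
        for h in range(height):
            for w in range(width):
                for i in range(r):
                    for j in range(r):
                        src = c * r * r + i * r + j
                        out[c][h * r + i][w * r + j] = x[src][h][w]
    return out


# Four 1x2 low-resolution channels become one 2x4 high-resolution channel.
low_res = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
high_res = pixel_shuffle(low_res, 2)
```

With `r = 2`, each group of four channels contributes one 2x2 block of output pixels, so the spatial size doubles in both directions while the channel count drops by a factor of four.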
- In Fig. 3 what is LCBAM?
Response6:
Thank you for your guidance. LCBAM is the lightweight version of CBAM. I have added a description of 'LCBAM'. Please see the Yellow highlighted part on line 235, page 6 for details.
- In equation (4), how do you select α,β?
Response7:
Thank you for your guidance. We did not delve into how to select the values to obtain the best results. I simply let α,β both equal 1. Please see the Yellow highlighted part on line 265, page 7 for details.
- Lines 240-249: there should be some reference on the gradient loss and the way it is used here.
Response8:
Thank you for your guidance. I have added a citation about image gradients on page 7, line 259, and a citation about their related usage on line 263.
- Equation (9) has an error.
Response9:
Thank you for your guidance. I have revised the equation. Please see the Yellow highlighted part on line 312, page 9 for details.
- Line 258: Please give more details about the training process. How many training signals, how many validation signals, how many test signals? Also, noisy signals are created by adding noise to clean signals. Do the clean signals of the training set overlap with the clean signals used to create the test noisy signals? This should be stated explicitly.
Response10:
Thank you for your guidance. I have added the description of the training set, validation set, and test set. Please see the Yellow highlighted part on line 291, page 8 for details.
- Related to the above comment, Tables I,II and III are computed over how many test signals?
Response11:
Thank you for your guidance. Tables I, II and III are computed over 3211 signals, and I have added the relevant description. Please see the Yellow highlighted part on line 390, page 13 for details.
- Can you explain why the algorithm’s output SNR is worse than the other two methods when the input SNR is above 4 (as shown in Table I) while the correlation coeff. and MAE are better than the other two methods (as shown in Tables II and III) ?
Response12:
Thank you for your guidance. I have added the description of the SNR calculation formula on line 313. We calculate the SNR based on the standard deviations before and after the P-wave arrival. DeepDenoiser and ARDU tend to suppress the signal before the P-wave arrival completely to zero, thereby obtaining signals with a high signal-to-noise ratio. We adopt a discriminator to ensure that the denoised results are consistent with the characteristics of real high-quality signals, and real signals sometimes exhibit small fluctuations before the P-wave arrival. As a result, our method sometimes cannot completely suppress the waveform before the P-wave arrival to zero, which leads to a lower SNR, especially when the SNR of the signal to be denoised is already high. I have added this description on line 370, page 12; please see the Yellow highlighted part for details.
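The effect described in this response can be sketched numerically. The exact formula on line 313 of the manuscript is not reproduced in this record, so the form used below, 20·log10 of the ratio between the standard deviation after the P-wave arrival and the standard deviation before it, is an assumption, as are the function name `snr_db`, the `eps` guard, and the toy traces.

```python
import math
import statistics


def snr_db(trace, p_index, eps=1e-12):
    """SNR in dB from the standard deviations after and before the P arrival.

    Assumed form: 20 * log10(std(signal window) / std(noise window)).
    `eps` guards against a noise window that is exactly zero.
    """
    noise_std = statistics.pstdev(trace[:p_index])
    signal_std = statistics.pstdev(trace[p_index:])
    return 20.0 * math.log10((signal_std + eps) / (noise_std + eps))


p_index = 4
# Same post-arrival waveform; only the pre-arrival samples differ.
small_fluctuations = [0.01, -0.01, 0.01, -0.01, 1.0, -1.0, 1.0, -1.0]
fully_suppressed = [0.0, 0.0, 0.0, 0.0, 1.0, -1.0, 1.0, -1.0]

snr_realistic = snr_db(small_fluctuations, p_index)
snr_zeroed = snr_db(fully_suppressed, p_index)
```

Under this assumed definition, the trace with small pre-arrival fluctuations scores about 40 dB, while the trace whose pre-arrival samples are forced to zero scores far higher even though the post-arrival waveform is identical. This is consistent with the explanation that complete suppression before the P-wave arrival inflates the output SNR.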
- Figure 7 shows promising results but figure 8 is not convincing. As the clean signal is not available, we have no idea which of the three algorithms is better just by looking at the spectrograms. How do we know whether the proposed algorithm also removes useful signal along with the noise in the low frequencies?
Response13:
Thank you for your guidance. Due to the lack of comparison with the true signal, Figure 8 indeed cannot reflect the quality of the denoising results. Considering that Figures 9 and 10 have already demonstrated the denoising effect of this method on field signals, we have removed Figure 8 and its related description.
- Figure 9 is also not convincing. The denoised signal shows no waveforms at all, just the P-arrival, while the other two methods retain some waveforms (e.g. DeepDenoiser shows waveforms between 1600 and 1800 km). Probably I am missing something here; please explain what the desired signal is in this case.
Response14:
Thank you for your guidance. Figure 9 shows the results for multiple signals arranged according to epicentral distance. The vertical shadow near 0 and the diagonal shadow around 176 seconds represent the arrivals of P-waves and S-waves, respectively. A high-quality seismic image should clearly show the duration of P-waves and S-waves without other noisy signals. For DeepDenoiser and ARDU, significant residual noisy signals are observed in the range of 1600 km to 1800 km. I have added a description of how Figure 9 was drawn in the Yellow highlighted part on line 417, page 13.
- The same holds for Fig.10: the residual of the proposed method contains many waveforms which seem to be clean/useful signals while the residuals of the other two methods seem more “noisy”.
Response15:
Thank you for your guidance. In Figure 10, the residual of the denoising result is the separated noise. It is supposed to be disorderly, with no observable P wave or S wave at all. However, in (a), apparent seismic phases can be observed, which means DeepDenoiser damages the real signal and removes some of it as noise. I have added a description of how Figure 10 was drawn and what it is supposed to show in the Yellow highlighted part on line 439, page 14.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
The manuscript "Time-Frequency Domain Seismic Signal Denoising Based on Generative Adversarial Networks" presents a deep-learning approach for denoising seismic signals, based on the representation of the signal in the time-frequency domain. Moreover, the proposed approach aims at improving the signal-to-noise ratio and preserving the essential properties of the seismic signal during the denoising process. Some issues still need to be clarified before publication. For example, an extension of the description of the methods used in the study is proposed to provide a more comprehensive explanation of the proposed algorithm. Furthermore, a significant improvement in the explanation of the figures included in the manuscript is recommended. In addition, the variables used in the equations in the text need to be adequately defined in the body of the article. Please see some of the comments below.
(1) Figure 1 illustrates the general structure of the method proposed by the authors. However, this figure is not described anywhere in the manuscript. The authors need to explain the proposed method by linking it to this figure and describing it in detail. The other figures are not discussed much in the text. The authors should describe the figures in more detail so that the reader can better understand their meaning.
(2) As for abbreviations, they must be inserted in brackets immediately after the written terms when they are defined for the first time. This task should be done the first time they appear in the main text. I suggest that the authors check the abbreviations entered in the manuscript. For example, in line 106, replace "STFT" with "short-time Fourier transform (STFT)". Please consult the guidelines for authors at https://www.mdpi.com/journal/applsci/instructions.
(3) The first paragraph is interesting and emphasizes the importance of seismic signals in various analyses. However, it lacks some references to support these statements. For example, in the excerpt "researchers can infer the characteristics of the Earth's interior, including the properties and boundaries of the mantle, crust, and core", the authors could include some references to tomography issues in seismology (for example, I can suggest Earth and Planetary Science Letters 2022, 593, 117688 and Algorithms 2024, 17(2), 71). For the field of geological exploration, authors may consider Natural Gas Industry B 2022, 9(1), 20-32 and Ore Geology Reviews 2024, 166, 105959. For the topic "Through analyzing the propagation velocities, reflections, refractions, and other characteristics of seismic signals, explorers can deduce the types, thicknesses, and structures of subsurface rock layers, thereby determining the potential locations and scales of oil and gas reservoirs," authors may consider Annual Review of Earth and Planetary Sciences 2002, 30, 259-284 (for reflections) and Geophysical Prospecting 2024, 72, 1189-1195 (for refractions). Finally, authors may consider IEEE Transactions on Geoscience and Remote Sensing 2022, 60, 1-14, Art No. 5913314 for noisy signals.
(4) In line 57, please provide the reference number [?] of Li et al. Do the same for the following citations.
(5) It is not clear what type of seismic noise is included in the signals. What types of noise were considered? Does the type of noise affect the preservation of the essential features of the seismic signal, while the proposed method reduces the noise?
(6) Could the authors provide more details on the masks used, how they were created and how the distribution of zeros and ones was done?
(7) Please define \tau in equation (2).
(8) A discussion of the associated computational costs is not addressed in the text. Such information is extremely important.
(9) Did the proposed algorithm generate many false positives during the training process? How was this circumvented?
Author Response
- Figure 1 illustrates the general structure of the method proposed by the authors. However, this figure is not described anywhere in the manuscript. The authors need to explain the proposed method by linking it to this figure and describing it in detail. The other figures are not discussed much in the text. The authors should describe the figures in more detail so that the reader can better understand their meaning.
Response1:
Thank you for your guidance. I have added a description of Figure 1. Please see the Yellow highlighted part on line 112, page 3 for details. For the other figures, we also added descriptions: Figure 2 on line 130, Figure 3 on line 179, and Figure 4 on line 231.
- As for abbreviations, they must be inserted in brackets immediately after the written terms when they are defined for the first time. This task should be done the first time they appear in the main text. I suggest that the authors check the abbreviations entered in the manuscript. For example, in line 106, replace "STFT" with "short-time Fourier transform (STFT)". Please consult the guidelines for authors at https://www.mdpi.com/journal/applsci/instructions.
Response2:
Thank you for your guidance. I have revised it according to your comments. Please see the Yellow highlighted part on line 88, page 2 for details.
- The first paragraph is interesting and emphasizes the importance of seismic signals in various analyses. However, it lacks some references to support these statements. For example, in the excerpt "researchers can infer the characteristics of the Earth's interior, including the properties and boundaries of the mantle, crust, and core", the authors could include some references to tomography issues in seismology (for example, I can suggest Earth and Planetary Science Letters 2022, 593, 117688 and Algorithms 2024, 17(2), 71). For the field of geological exploration, authors may consider Natural Gas Industry B 2022, 9(1), 20-32 and Ore Geology Reviews 2024, 166, 105959. For the topic "Through analyzing the propagation velocities, reflections, refractions, and other characteristics of seismic signals, explorers can deduce the types, thicknesses, and structures of subsurface rock layers, thereby determining the potential locations and scales of oil and gas reservoirs," authors may consider Annual Review of Earth and Planetary Sciences 2002, 30, 259-284 (for reflections) and Geophysical Prospecting 2024, 72, 1189-1195 (for refractions). Finally, authors may consider IEEE Transactions on Geoscience and Remote Sensing 2022, 60, 1-14, Art No. 5913314 for noisy signals.
Response3:
Thank you for your guidance. I have added references according to your comments. Please see the Yellow highlighted parts on lines 30, 33, 36, and 41, page 1 for details.
- In line 57, please provide the reference number [?] of Li et al. Do the same for the following citations.
Response4:
Thank you for your guidance. I have added references according to your comments. Please see the Yellow highlighted parts on lines 58, 59, 62, and 64, page 2 for details.
- It is not clear what type of seismic noise is included in the signals. What types of noise were considered? Does the type of noise affect the preservation of the essential features of the seismic signal, while the proposed method reduces the noise?
Response5:
Thank you for your guidance. The STEAD dataset we used for training is a large-scale dataset of global earthquake and non-earthquake signals. Since this dataset includes a sufficient variety of noise types, our model has good generalization ability. We have added a description of this dataset in the yellow highlighted part on line 291, and added an example of high-frequency noise denoising in Figure 8 on line 355.
- Could the authors provide more details on the masks used, how they were created and how the distribution of zeros and ones was done?
Response6:
Thank you for your guidance. The mask is output by the generator, and its size is consistent with the input time-frequency spectrum. The generator determines the value of each element in the mask based on the input time-frequency spectrum. By multiplying the mask with the real and imaginary parts of the time-frequency spectrum respectively, the denoised time-frequency spectrum is obtained. I have added a description of the mask on line 93, page 2.
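The masking step described in this response can be sketched as follows. This is only an illustration under stated assumptions: the function name `apply_mask` is hypothetical, the spectrum is a toy 2x2 array rather than a real short-time Fourier transform, and the mask is written by hand with values between 0 and 1 instead of being produced by the generator.

```python
def apply_mask(spectrum, mask):
    """Multiply the real and imaginary parts of each bin by its mask value."""
    if len(spectrum) != len(mask) or any(
            len(s_row) != len(m_row) for s_row, m_row in zip(spectrum, mask)):
        raise ValueError("mask must have the same size as the spectrum")
    return [
        [complex(m * s.real, m * s.imag) for s, m in zip(s_row, m_row)]
        for s_row, m_row in zip(spectrum, mask)
    ]


# Toy 2x2 time-frequency spectrum (rows: frequency bins, columns: time frames).
noisy_spectrum = [[3 + 4j, 1 - 1j],
                  [0.5 + 0.5j, 2 + 0j]]
# Hand-written stand-in for the generator output (assumed range 0 to 1).
mask = [[1.0, 0.0],
        [0.5, 1.0]]

denoised_spectrum = apply_mask(noisy_spectrum, mask)
```

Because the same real-valued mask scales both parts of each complex bin, the magnitude of the bin is attenuated while its phase is left unchanged. A mask value of 1 keeps a bin, a value of 0 removes it, and intermediate values attenuate it.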
- Please define \tau in equation (2).
Response7:
Thank you for your guidance. I have added a description of how \tau is defined on line 181, page 5.
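Equation (2) itself is not reproduced in this record. Residual shrinkage units are commonly built around soft thresholding with a threshold \tau, so the sketch below assumes that equation (2) has this form. The function name `soft_threshold` and the fixed value `tau=0.5` are illustrative only; in a residual shrinkage unit the threshold is typically learned from the features rather than set by hand.

```python
def soft_threshold(x, tau):
    """Shrink x toward zero by tau; values with |x| <= tau become zero."""
    if tau < 0:
        raise ValueError("tau must be non-negative")
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


# Small-magnitude features are zeroed; larger ones are shrunk by tau.
features = [-2.0, -0.3, 0.0, 0.4, 1.5]
shrunk = [soft_threshold(v, tau=0.5) for v in features]
```

Under this assumption, \tau controls how aggressively small-magnitude features, which are more likely to be noise, are removed.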
- A discussion of the associated computational costs is not addressed in the text. Such information is extremely important.
Response8:
Thank you for your guidance. I have analyzed the computational costs of the three models in the yellow highlighted part on line 393, page 13. The computational cost of the proposed method is lower than ARDU. Although the computational cost of DeepDenoiser is significantly lower than the proposed method, since ARDU is an improvement based on DeepDenoiser, simply increasing the parameters of DeepDenoiser to be the same as the proposed method cannot achieve better results. It can be considered that with the same number of parameters, the proposed method would achieve the best results.
- Did the proposed algorithm generate many false positives during the training process? How was this circumvented?
Response9:
Thank you for your guidance. In deep learning, incorrect labels can easily lead to false positives. We selected signals with very high SNR (SNR > 50) from STEAD as noise-free signals to avoid the problem of low-quality training data as much as possible. I added a description of how we built the training, validation, and test sets on line 293, page 8. From the experimental results, the proposed method achieved better performance than DeepDenoiser and ARDU on both synthetic and real data, suppressing noise while reducing damage to real signals.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have implemented most of the requested corrections and the paper has been significantly improved.
Comments on the Quality of English Language
Minor text corrections could be made.
Reviewer 2 Report
Comments and Suggestions for Authors
The authors addressed all questions raised in the first round of reviews.