An Efficient Method to Learn Overcomplete Multi-Scale Dictionaries of ECG Signals
Reviewer 1 Report
After reading your paper I have read ref and my opinion is that this paper adds little information to that paper. The contribution will undoubtedly be greater when you finish the adaptation to all ECG waves, including P and T waves.
Nevertheless, the paper is well written, the methodology seems correct and the results are interesting. So, if the decision is to accept the manuscript, I recommend only reducing the size of some figures (1, 4, 5, 7, 8, 9).
We thank the reviewer for his/her comments. Regarding the two issues mentioned:
Note that this paper extends the one presented at TSP 2018 (reference ) and, if accepted, is due to be published in a special issue of selected papers from that conference. Therefore, the basis of the paper should be the same as that of the TSP 2018 paper. However, there is a substantial difference between the two: in the TSP 2018 paper we constructed the dictionary using a single waveform, whereas here we use multiple waveforms to construct the atoms of the dictionary. This extension was already suggested in the TSP 2018 paper ("If desired, select additional waveforms highly correlated with respect to the remaining candidate waveforms (to obtain representative dictionary atoms) as well as with low absolute correlation with respect to already selected waveforms (to avoid similar atoms)"), but no specific procedure to achieve it was proposed, as we had not developed it yet. In this paper we describe a novel and precise procedure for doing so, detail all of its stages, and provide simulation results. The addition of multiple waveforms increases the expressive power of the dictionary and improves the performance, as shown in the numerical results section, so we believe this to be a major extension with respect to the TSP 2018 paper. We have also made other important extensions: the resampling stages have been improved to avoid edge effects, and many more simulations have been performed, including simulations on patients not used to construct the dictionary and on other leads. Finally, the literature review is much more detailed and all the stages of the proposed approach are described in much greater depth (this was not possible in the original conference paper due to space constraints). Therefore, we believe that these are important contributions and we do not agree that the paper adds little information with respect to the TSP 2018 paper.
However, we admit that this might not have been clearly stated in the original manuscript, so we have added 8 lines at the end of page 2 (in red) explaining the main novelties with respect to the TSP 2018 paper.
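The selection criterion quoted above (representative waveforms that are also dissimilar to those already chosen) can be illustrated with a minimal greedy sketch. This is not the procedure described in the manuscript: the function name `select_waveforms`, the threshold `max_sim` and the use of the mean absolute correlation as a measure of representativeness are all assumptions made only for illustration.

```python
import numpy as np

def select_waveforms(W, n_select, max_sim=0.9):
    """Greedy selection of rows of W (one candidate waveform per row).

    Hypothetical illustration: `max_sim` is an assumed threshold on the
    absolute correlation with already selected waveforms.
    """
    # Normalise each candidate so that inner products are correlations.
    W = W / np.linalg.norm(W, axis=1, keepdims=True)
    C = np.abs(W @ W.T)
    np.fill_diagonal(C, 0.0)  # ignore self-correlation

    selected = []
    remaining = list(range(len(W)))
    while remaining and len(selected) < n_select:
        # Keep only candidates that are dissimilar to every selected waveform.
        ok = [i for i in remaining if all(C[i, j] < max_sim for j in selected)]
        if not ok:
            break
        # Among those, pick the one most correlated with the remaining candidates.
        best = max(ok, key=lambda i: C[i, remaining].mean())
        selected.append(best)
        remaining.remove(best)
    return selected

# Toy candidates: three similar waveforms and one clearly different one.
W_toy = np.array([[1.0, 0.0, 0.0],
                  [1.0, 0.2, 0.0],
                  [1.0, -0.1, 0.0],
                  [0.0, 1.0, 0.0]])
chosen = select_waveforms(W_toy, n_select=2)
```

In this toy case one waveform from the group of similar candidates is chosen first, and the second choice falls on the dissimilar candidate because the other two exceed the assumed similarity threshold.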
We agree that some figures are too large, but this was intentional so that they could be clearly visualized during the review of the paper. In the LaTeX code, their width is always 0.8\columnwidth (except for Figures 2, 3 and 10), meaning that they will only occupy one column in the final two-column version of the paper.
Reviewer 2 Report
The paper presents a new method for sparse learning of ECG waveforms. It is well written, easy to follow, and results are justified by experiments on real data.
Some comments to improve the final version are the following:
* Some proofreading and style check (etc, ":", others) should be addressed.
* Please use first the term and then the acronym (e.g. in LASSO it is made the opposite).
* It is highly recommendable to better describe the need for the method, the motivation and the contribution to the ECG practice.
* The use of a single prototype for each patient can be limited. The sinus rhythm can change even in the same patient. Also, it is not clear how to deal with non-sinus rhythm beats. Please extend on this.
* Which practical application or need does the proposal support? Please clarify, not only with references to previous works, but clearly.
* The out-of-patient performance should be commented, what is the actual performance on a new patient which was not considered on the procedure?
* Why discarding patients in the preprocessing?
* Fig. 5: Would it be better to align the beats with respect to the R-wave peak? Otherwise, there is a marked phase jitter on the beat position.
We thank the reviewer for his/her comments. Regarding the issues mentioned:
We have reviewed the whole paper, correcting all the typos and style issues that we have found.
We have corrected the reversed definition of term/acronym in LASSO.
Throughout the paper we describe a novel dictionary construction method that can later be applied in many ECG signal processing applications. Therefore, we believe that it would not be appropriate to describe any particular problem in detail. However, we have introduced 6 lines on page 3 (in red) mentioning several potential applications (including references) where the constructed dictionary would be useful.
We agree that extracting a single waveform per patient can be a limitation. However, since the recordings used correspond to healthy patients and are rather short (less than 2 minutes), a single waveform is enough to represent the average QRS complex for each patient. We are currently trying to develop an efficient method to extract multiple waveforms from each patient, but this is a much more challenging issue. We have introduced a footnote on page 7 (in red) to clarify this issue.
Again, in this paper we focus on the dictionary construction. In the sparse inference and compressed sensing literature it is well known that better dictionaries result in improved performance, and many sparse inference/compressed sensing applications have been developed in ECG signal processing. However, since we do not address any particular application, we prefer to provide references to several relevant applications instead of describing one specific application in detail.
The out-of-patient performance is analyzed on a test set composed of 11 recordings not used to construct the dictionary. Furthermore, we have also tested the performance of the proposed approach on signals recorded on leads different from the one used to build the dictionary. Two sentences (in red) have been introduced on pages 15 and 16 to clarify this issue, and we have also modified the captions of Table 1 and Figure 10 to emphasize it further.
In the preprocessing we simply discard those patients for which QRS complexes cannot be reliably extracted, since wrongly extracted QRS complexes could have a very negative impact on the dictionary construction.
We considered whether to align the R peaks in the proposed approach. In the end, we decided not to align them, because doing so forced us to insert zeros at both ends of the extracted waveforms, and this had a negative impact on the performance of the sparse inference algorithm used. Furthermore, note that the jitter on the beat position is irrelevant for the final approximation, since we always know the exact location of the R peak within each waveform (it is obtained during the QRS extraction stage) and the overcomplete dictionary is composed of time-shifted versions of all the waveforms. Therefore, from the sparse approximation we can easily extract the exact location of all the R peaks. However, this is an interesting issue that deserves further study.
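The argument about time-shifted atoms can be illustrated with a small sketch. It assumes a single toy waveform and replaces the sparse inference algorithm used in the paper with a single matching-pursuit step (choosing the atom with the largest absolute correlation). The names `shifted_dictionary` and `locate_r_peak` and the toy data are hypothetical.

```python
import numpy as np

def shifted_dictionary(waveform, n):
    """Return an n x (n - m + 1) matrix whose columns are the waveform
    placed at every shift that fits entirely inside a length-n signal."""
    m = len(waveform)
    D = np.zeros((n, n - m + 1))
    for k in range(n - m + 1):
        D[k:k + m, k] = waveform
    return D

def locate_r_peak(signal, waveform):
    """Locate one beat by picking the best-matching shifted atom.

    A single matching-pursuit step is used here as a stand-in for the
    sparse inference stage; it is not the algorithm used in the paper.
    """
    D = shifted_dictionary(waveform, len(signal))
    D = D / np.linalg.norm(D, axis=0)            # unit-norm atoms
    shift = int(np.argmax(np.abs(D.T @ signal)))
    # The peak offset inside the waveform is known from the extraction
    # stage; here it is taken as the largest absolute sample.
    peak_offset = int(np.argmax(np.abs(waveform)))
    return shift + peak_offset

# Toy example: a QRS-like waveform whose peak is at index 2,
# placed at samples 17..21 of an otherwise zero signal.
toy_waveform = np.array([0.0, -0.2, 1.0, -0.3, 0.0])
toy_signal = np.zeros(40)
toy_signal[17:22] = 1.5 * toy_waveform
r_peak = locate_r_peak(toy_signal, toy_waveform)
```

Because the selected atom carries its own shift, the position of the R peak follows from adding the known peak offset of the waveform to that shift, regardless of where the peak sits inside the waveform.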