Hybrid Adaptive Segmentation and Morphology-Based Classification of EOG for Automated Detection of Phasic and Tonic REM Sleep
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
- The paper presents a well-motivated and technically sound hybrid framework that combines adaptive MAD-based segmentation with morphology-aware template matching, addressing clear limitations of fixed-threshold and generic wavelet-based EOG methods.
- While EyeCon provides excellent event-level ground truth, the clinical PSG dataset lacks manual phasic/tonic REM annotations, making it difficult to quantitatively assess classification accuracy at the REM microstate level.
- PSG was performed in young, healthy adults only, thereby potentially restricting the generalizability to elderly populations and patients with complex sleep disorders (e.g., REM behavior disorder, narcolepsy).
- While their use of single-channel EOG is a practical advantage, the lack of information about vertical EOG, EEG or EMG may also limit robustness in certain recording scenarios with strong artifacts and atypical eye movement patterns.
- The study does not include a direct quantitative comparison with existing wavelet-based or threshold-based phasic REM detectors on the same datasets, which would further clarify performance gains.
Author Response
Response to Reviewer 1
We thank the reviewer for the feedback and constructive comments. We respond to the individual points below:
Comment 1
While EyeCon provides excellent event-level ground truth, the clinical PSG dataset lacks manual phasic/tonic REM annotations, making it difficult to quantitatively assess classification accuracy at the REM microstate level.
Response:
We thank the reviewer for this important observation. We fully agree that the absence of manual tonic/phasic REM annotations in the clinical PSG dataset prevents direct quantitative evaluation at the REM microstate level. This limitation is inherent to current clinical scoring practice, as tonic/phasic REM microstructure is not part of standard AASM annotations.
To address this point, we have explicitly clarified this limitation in the revised manuscript and explained our evaluation strategy. Specifically, EyeCon data were used for event-level validation with precise ground truth, whereas clinical PSG data were evaluated through physiological plausibility (REM microstructure proportions and EEG spectral signatures), rather than direct microstate-level accuracy.
Location in manuscript:
Discussion, subsection "Limitations".
Comment 2
PSG was performed in young, healthy adults only, thereby potentially restricting the generalizability to elderly populations and patients with complex sleep disorders.
Response:
We agree with the reviewer. The present study was designed as a methodological validation study conducted in a controlled population of young, healthy adults. This design choice allowed us to isolate and evaluate the performance of the proposed algorithm without the confounding influence of pathological REM patterns. We now explicitly state this as a limitation and outline future validation in elderly and clinical cohorts as a necessary next step.
Location in manuscript:
Discussion, subsection "Limitations".
Comment 3
While their use of single-channel EOG is a practical advantage, the lack of information about vertical EOG, EEG or EMG may limit robustness in certain recording scenarios.
Response:
We thank the reviewer for highlighting this aspect. We have clarified in the Methods section that all segmentation and classification steps were intentionally performed using a single horizontal EOG channel only. EEG and EMG signals were not used for detection or classification, but EEG was analyzed independently to provide physiological validation of the reconstructed REM microstructure. We now explicitly discuss this design choice as a trade-off between simplicity and robustness. While single-channel EOG improves implementability and suitability for ambulatory or resource-limited settings, we acknowledge that multimodal fusion (vertical EOG, EEG, EMG) may further improve robustness in artifact-heavy or atypical recordings. This balance between interpretability and multimodal complexity is now discussed in the Limitations section.
Location in manuscript:
Materials and Methods, subsection "Preprocessing";
Discussion, subsection "Limitations".
Comment 4
The study does not include a direct quantitative comparison with existing wavelet-based or threshold-based phasic REM detectors on the same datasets.
Response:
We acknowledge this comment. A direct numerical benchmark against all existing wavelet- or threshold-based detectors was not a primary aim of this study. Instead, the proposed method was designed as a morphology-aware alternative motivated by known limitations of fixed thresholds and generic wavelet bases.
To address this point, we expanded the Discussion section to include a detailed qualitative comparison with classical threshold-based and wavelet-based approaches, highlighting differences in robustness, morphology sensitivity, and suitability for full-night PSG recordings. We also clarify that a systematic quantitative benchmark across detector families represents a natural direction for future work.
Location in manuscript:
Discussion, subsection "Comparison with Existing Literature".
Reviewer 2 Report
Comments and Suggestions for Authors
The authors evaluated an automated framework for detecting phasic REM based on hybrid adaptive segmentation of EOG data. The framework was applied to EyeCon datasets and yielded a 92.9% correct detection of phasic REM states, which was validated using EEG data. While the findings are of interest, a few questions remain.
Introduction
1. While an overview of the use of EOG for classification is provided, a few key as well as recent papers are missing from the introduction [1-4], which should be compared and contrasted, given the similarity in goals.
2. What is the significance of automatic tonic/phasic REM classification?
3. Were there a-priori hypotheses that were considered in this study?
Materials and Methods
4. Clarify the interval used in z-score transformation in line 151, page 4.
5. Be sure to provide appropriate citation for prior use of proposed normalization in EOG data on line 160, page 4.
6. Given the use of a custom kernel that aimed to match a three-phase saccade morphology, was there a prior validation of the custom kernel in previously labeled data?
7. While a high correlation is reported in lines 189-191, what was the exact value and how did it compare with other kernels?
8. Did the authors consider a comparison of their approach with other kernels or architectures to validate changes in performance?
Results
9. Further clarification on the relative percentage and absolute increases in low-beta and gamma power during phasic REM sleep would be helpful.
10. What were the hyperparameters used for the proposed approach to aid in reproducibility?
11. How well did the algorithm generalize to unseen data from different participants? What was the range of performance?
Discussion
12. How does performance compare and contrast with prior work [1-4]?
13. Good overview of strengths and limitations, but did the authors consider the use of alternative architectures or kernels for feature extraction?
Literature cited:
[1] Rechichi, Irene, et al. "Single‐channel EEG classification of sleep stages based on REM microstructure." Healthcare technology letters 8.3 (2021): 58-65.
[2] Cooray, Navin, et al. "Detection of REM sleep behaviour disorder by automated polysomnography analysis." Clinical Neurophysiology 130.4 (2019): 505-514.
[3] Rahman, Md Mosheyur, Mohammed Imamul Hassan Bhuiyan, and Ahnaf Rashik Hassan. "Sleep stage classification using single-channel EOG." Computers in biology and medicine 102 (2018): 211-220.
[4] Maiti, Suvadeep, Shivam Kumar Sharma, and Raju S. Bapi. "Enhancing healthcare with EOG: A novel approach to sleep stage classification." ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2024.
Author Response
Response to Reviewer 2
Comment 1
Several key and recent papers are missing from the introduction and should be compared and contrasted.
Response:
We thank the reviewer for this suggestion. The Introduction has been expanded to include the recommended recent and relevant studies. We now explicitly compare these works with the present study and clarify that while studies [1–4] primarily rely on wavelet-based, fixed-threshold, or architecturally more complex models (e.g., deep networks), our approach combines MAD-adaptive segmentation with morphology-informed template matching, emphasizing interpretability, low computational complexity, and suitability for single-channel EOG configurations.
Location in manuscript:
Introduction, paragraphs discussing related work and study positioning.
Comment 2
What is the significance of automatic tonic/phasic REM classification?
Response:
We appreciate this comment. The Introduction has been revised to explicitly state the scientific and clinical relevance of automatic tonic/phasic REM classification, including its role in REM microstructure analysis, physiological interpretation, and potential clinical applications.
Location in manuscript:
Introduction, paragraphs describing the significance of tonic/phasic REM microstructure.
Comment 3
Were there a-priori hypotheses considered in this study?
Response:
Yes. To clarify this, we have added a dedicated subsection explicitly stating the a-priori hypotheses formulated before the evaluation. The explicitly stated hypotheses address (i) improved robustness of adaptive MAD-based segmentation compared to fixed thresholds, (ii) enhanced physiological relevance achieved through morphology-aware kernel design, and (iii) correspondence between EOG-derived phasic REM and increased EEG beta/gamma activity.
Location in manuscript:
Introduction, subsection "A-priori hypotheses".
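Hypothesis (i) contrasts MAD-adaptive segmentation with fixed thresholds. The following minimal sketch illustrates the general idea only and is not the authors' implementation: the multiplier `k`, the function names, and the use of the 1.4826 Gaussian consistency constant are assumptions made for this example.

```python
from statistics import median

# Standard constant that makes the MAD comparable to a standard deviation
# for Gaussian data; its use here is an assumption of this sketch.
GAUSSIAN_MAD_SCALE = 1.4826


def mad_threshold(samples, k=3.0):
    """Return (median, threshold), with threshold = k * 1.4826 * MAD.

    `k` is a hypothetical multiplier, not a value taken from the manuscript.
    """
    med = median(samples)
    mad = median([abs(x - med) for x in samples])
    return med, k * GAUSSIAN_MAD_SCALE * mad


def candidate_segments(samples, k=3.0):
    """Return (start, stop) index pairs, stop exclusive, of runs in which
    the deviation from the median exceeds the MAD-based threshold."""
    med, thr = mad_threshold(samples, k)
    segments = []
    start = None
    for i, x in enumerate(samples):
        above = abs(x - med) > thr
        if above and start is None:
            start = i                      # a supra-threshold run begins
        elif not above and start is not None:
            segments.append((start, i))    # the run ended at index i
            start = None
    if start is not None:                  # run extends to the end
        segments.append((start, len(samples)))
    return segments
```

Because the threshold is derived from the signal's own spread, multiplying the whole recording by a constant gain leaves the detected segments unchanged, which a fixed amplitude threshold cannot guarantee.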
Comment 4
Clarify the interval used in z-score transformation.
Response:
We have clarified that z-score normalization was computed separately for each expert-scored REM interval. The reference mean and standard deviation were estimated exclusively from samples within the same REM interval, without using non-REM data.
Location in manuscript:
Materials and Methods, subsection "Preprocessing".
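The per-interval normalization described in this response can be sketched as follows. This is an illustrative example rather than the authors' code: the function name, the representation of REM intervals as `(start, stop)` sample indices, and the choice of the population standard deviation are assumptions.

```python
from statistics import fmean, pstdev


def zscore_per_interval(signal, intervals):
    """Z-score each REM interval using only samples from that interval.

    signal    : list of EOG samples
    intervals : list of (start, stop) sample indices, stop exclusive
                (hypothetical representation of expert-scored REM intervals)
    Returns one normalized list per interval.
    """
    normalized = []
    for start, stop in intervals:
        segment = signal[start:stop]
        mu = fmean(segment)
        sigma = pstdev(segment, mu)   # population SD of this interval only
        if sigma == 0:
            # A constant segment has no spread; map it to zeros.
            normalized.append([0.0] * len(segment))
        else:
            normalized.append([(x - mu) / sigma for x in segment])
    return normalized
```

Samples outside the listed intervals never enter the mean or standard deviation, so non-REM data cannot influence the reference statistics.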
Comment 5
Provide appropriate citation for prior use of the proposed normalization.
Response:
We have added citations to prior EOG-based sleep and eye-movement studies that employ REM-interval or segment-based normalization to improve cross-subject comparability.
Location in manuscript:
Materials and Methods, subsection "Preprocessing".
Comment 6
Was there a prior validation of the custom kernel on labeled data?
Response:
Yes. The revised manuscript now explicitly states that the externally labeled EyeCon dataset was used during method development to verify that the custom kernel reliably matches ground-truth saccade morphologies under controlled conditions.
Location in manuscript:
Materials and Methods, subsection "Morphology Score Using a Custom Saccade Kernel".
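One common way to score how well a segment matches a template kernel is the maximum Pearson correlation over all alignments. The sketch below illustrates that generic technique and may differ from the scoring used in the manuscript; the kernel values are a purely illustrative three-phase placeholder and the function names are hypothetical.

```python
import math

# Illustrative three-phase shape (dip, peak, dip). These values are
# placeholders for this sketch, not the authors' saccade kernel.
EXAMPLE_KERNEL = [0.0, -1.0, -0.5, 2.0, 3.0, 2.0, -0.5, -1.0, 0.0]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    dev_a = [x - mean_a for x in a]
    dev_b = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(dev_a, dev_b)) / denom


def morphology_score(segment, kernel=EXAMPLE_KERNEL):
    """Maximum correlation between the kernel and any window of the segment.

    Correlation ignores offset and positive gain, so the score reflects
    waveform shape rather than amplitude.
    """
    m = len(kernel)
    if len(segment) < m:
        return 0.0
    return max(pearson(segment[i:i + m], kernel)
               for i in range(len(segment) - m + 1))
```

A segment containing a shifted and rescaled copy of the kernel scores close to 1, while a flat segment scores 0.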
Comment 7
What was the exact correlation value and how did it compare with other kernels?
Response:
A systematic numerical benchmark across kernel families was not a primary aim of this study. The custom kernel was designed based on the physiological morphology of triphasic saccades and was pilot-tested on externally labeled EyeCon data during method development. We now clarify this validation step and discuss quantitative kernel comparison as a natural direction for future work.
Location in manuscript:
Materials and Methods, subsection "Morphology Score Using a Custom Saccade Kernel".
Comment 8
Did the authors consider comparison with other kernels or architectures?
Response:
Yes. This point is now addressed in the Discussion, where we explicitly discuss alternative kernels and architectures and outline them as future research directions.
Location in manuscript:
Discussion, subsection "Methodological Considerations".
Comment 9
Clarify relative and absolute increases in EEG beta and gamma power.
Response:
The Results section has been revised to clarify both the direction and physiological interpretation of EEG spectral differences between phasic and tonic REM. EEG is now explicitly presented as an independent validation modality rather than a component of the detection pipeline.
Location in manuscript:
Results, subsection "EEG Differences Between Phasic and Tonic REM".
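The distinction between absolute and relative increases raised in this comment can be made concrete with a short sketch. The function name is hypothetical and the numbers in the usage note are invented for illustration; they are not results from the study.

```python
def band_power_change(tonic_power, phasic_power):
    """Return (absolute, relative_percent) change from tonic to phasic REM.

    absolute         = phasic - tonic, in the units of the inputs
    relative_percent = 100 * (phasic - tonic) / tonic
    """
    if tonic_power <= 0:
        raise ValueError("tonic_power must be positive")
    absolute = phasic_power - tonic_power
    relative_percent = 100.0 * absolute / tonic_power
    return absolute, relative_percent
```

For example, with made-up band powers of 2.0 (tonic) and 2.5 (phasic) in the same units, the absolute increase is 0.5 and the relative increase is 25%.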
Comment 10
What were the hyperparameters used to aid reproducibility?
Response:
We have added a dedicated table summarizing all key hyperparameters and implementation settings used in the segmentation and classification pipeline.
Location in manuscript:
Materials and Methods, end of subsection "Hybrid Adaptive Segmentation" (Table of hyperparameters).
Comment 11
How well did the algorithm generalize across unseen participants?
Response:
Cross-subject generalization is now explicitly addressed by discussing the low inter-subject variability of phasic REM estimates across participants, supporting stable performance on unseen data despite differences in EOG amplitude and REM density.
Location in manuscript:
Results, subsection "REM Microstructure in Clinical PSG".
Comment 12
How does performance compare with prior work?
Response:
The Discussion has been expanded to provide a detailed comparison with prior threshold-based, wavelet-based, and learning-based approaches, highlighting methodological and conceptual differences.
Location in manuscript:
Discussion, subsection "Comparison with Existing Literature".
Comment 13
Did the authors consider alternative architectures or kernels?
Response:
Yes. This is now explicitly discussed as part of future methodological extensions in the Discussion section.
Location in manuscript:
Discussion, subsection "Methodological Considerations".
Round 2
Reviewer 2 Report
Comments and Suggestions for Authors
The authors have adequately addressed feedback provided and are commended for their revisions.

