Neuroscience-Inspired Deep Learning Brain–Machine Interface Decoder
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
I have reviewed the manuscript titled “Neuroscience-Inspired Deep Learning Brain-Machine Interface Decoder” and find that it addresses an important and timely problem in invasive BMI research by proposing a neuroscience-inspired Single-Direction CNN-LSTM architecture that separates flexion and extension dynamics and demonstrates improved cross-target generalizability and physiologically meaningful co-contraction estimation; however, the manuscript in its current form requires major revision before it can be considered for publication in an international journal. The experimental method delivers complete information about all experimental procedures yet multiple critical issues impact both research findings and scientific procedures:
(i) The research depends on data from one macaque and one experimental design without showing any evidence of validation through independent data or subjects which reduces the study's ability to demonstrate stability and universal applicability;
(ii) The research fails to show how the single-direction design element impacts results because it lacks a systematic ablation study to assess this particular component against other architectural elements and training parameters;
(iii) The paper omits a particular section which describes the new elements of this method compared to previous CNN-LSTM-based BMI decoders except for basic neuroscience information;
(iv) The evaluation depends on R² metrics for evaluation but the results would gain more validity by implementing additional evaluation metrics together with a complete statistical testing procedure which includes effect size measurements and multiple testing correction;
(v) The research lacks accessible code and processed data and the authors did not include any repository link which creates obstacles for reproducing deep-learning-based research;
(vi) The co-contraction estimation results show promising results but the method requires external verification through physiological tests to confirm its precision so the researchers should present these findings with increased caution and additional proof;
(vii) The presentation of figures and tables requires improvement through enhanced font sizes and standardized target axis labels because multiple plots need detailed analysis to understand their information;
(viii) The paper needs to include a separate section which focuses on limitations and future directions to discuss three essential topics about subject-specific bias and human BMI scalability and musculoskeletal model sensitivity and explainable AI integration for feature interpretation.
The research presents an interesting concept through its thorough experimental work yet needs better methodological proof and enhanced explanations of its original aspects and better disclosure of methods and more extensive critical evaluation to achieve publication in a leading academic journal.
Author Response
Comment1: The research depends on data from one macaque and one experimental design without showing any evidence of validation through independent data or subjects which reduces the study's ability to demonstrate stability and universal applicability.
Response1: Thank you very much for your helpful comments on the study’s universal applicability. Validating the decoder across diverse subjects with online paradigms would improve the cogency of this research.
We regret that we did not include data from other macaques in the current study. We also acknowledge that the validity of this method could be further strengthened if similar results were confirmed across multiple subjects. We have now addressed this limitation in the Discussion section and outlined plans to explore this in future work. We hope that this revision addresses your concern.
Comment2: The research fails to show how the single-direction design element impacts results because it lacks a systematic ablation study to assess this component against other architectural elements and training parameters;
Response2: Thank you for your insightful comment regarding the lack of a systematic ablation study for the single-direction design element. We acknowledge that an ablation study would provide a more rigorous analysis of this component's specific contribution.
To address this valuable suggestion, we have now conducted additional experiments that systematically isolate and evaluate the single-direction design by comparing it with:
1) A variant, named LinearNet, that replaces the ReLU activation of the output layer with a linear activation,
2) A variant, named SharedNet, in which the feature extraction layers are shared across the branches.
The results, which we have added to Section 3.3 of the revised manuscript, confirm that the single-direction design contributes to the high generalizability across different targets.
We believe these additions substantially address your concern and thank you for helping us improve the rigor of our work.
Comment3: The paper omits a particular section which describes the new elements of this method compared to previous CNN-LSTM-based BMI decoders except for basic neuroscience information.
Response3: Thank you for your review comment. Regarding your observation that the paper "omits a particular section which describes the new elements of this method compared to previous CNN-LSTM-based BMI decoders except for basic neuroscience information," we fully agree with your assessment. To more clearly highlight the contributions of our study, we have added a new section titled "Proposed Mode" (now Section 2.3) in the revised manuscript, which specifically elaborates on the core improvements in model architecture. We have also revised Figure 1 so that the differences between our proposed model and previous models are easier to understand. Specifically, we have addressed the following aspects:
- Multi-Branch Architecture: A key distinction of our model from conventional CNN-LSTM decoders is its multi-branch structure. While previous works typically employ a single decoder with shared feature extraction layers to estimate all output variables simultaneously, our model dedicates an independent pair of CNN-LSTM blocks to each joint variable (angular velocity and torque). This design allows each branch to learn specialized features for its specific target, preventing the interference that can occur in fully shared architectures.
- Single-Direction Decomposition: Inspired by the neural encoding of opposing movements, our model decomposes each final output into two unidirectional components. This is achieved by using a ReLU activation function on the output layer of the two blocks within each dedicated pair. This constraint forces one block to learn the positive component (e.g., extension) and the other to learn the negative component (e.g., flexion) of the motion variable. The final, physiologically relevant net motion parameter is then reconstructed by subtracting the negative component from the positive one via a subsequent subtraction layer.
We also mention this in the Introduction (Section 1).
We believe that with the addition of this section, the novelty of our study and its distinctions from existing work will be much clearer. Thank you again for your valuable suggestion, which has been instrumental in improving the quality of our paper.
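The single-direction decomposition described in this response can be illustrated with a minimal sketch. It assumes only what the response states: each block of a pair ends in a ReLU, and a subtraction layer combines the two outputs. The function names and the sample pre-activation values are hypothetical, and the CNN-LSTM feature extraction that would produce those values is omitted.

```python
def relu(x):
    """Rectified linear unit: passes positive values, clips negatives to zero."""
    return max(0.0, x)


def single_direction_output(pos_pre, neg_pre):
    """Combine two unidirectional branch outputs into one signed signal.

    pos_pre, neg_pre: pre-activation outputs of the two blocks in a pair
    (hypothetical inputs standing in for the CNN-LSTM block outputs).
    """
    pos = [relu(v) for v in pos_pre]          # non-negative component, e.g. extension
    neg = [relu(v) for v in neg_pre]          # non-negative component, e.g. flexion
    net = [p - n for p, n in zip(pos, neg)]   # subtraction layer: signed net output
    return pos, neg, net


# Made-up pre-activations for three time steps.
pos, neg, net = single_direction_output([1.0, -0.5, 0.5], [-0.25, 0.75, 0.25])
print(pos, neg, net)
```

Because both components are constrained to be non-negative, the sign of the net output is determined entirely by which block is more active at a given time step.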
Comment4: The evaluation depends on R² metrics for evaluation but the results would gain more validity by implementing additional evaluation metrics together with a complete statistical testing procedure which includes effect size measurements and multiple testing correction.
Response4: Thank you for this constructive suggestion. We agree that relying solely on R² limits the comprehensiveness of our evaluation, and that a more rigorous statistical framework—including effect size measurements and correction for multiple comparisons—would significantly strengthen the validity of our findings.
In response, we have made the following revisions to the manuscript:
- Effect Size Measurements: To better interpret the magnitude of the observed effects, we have used paired t-tests for our main statistical tests.
- Multiple Comparisons Correction: As you recommended, we have implemented a complete statistical testing procedure. Specifically, when conducting multiple comparisons (e.g., across different time bins, frequency bands, or neural populations), we now apply the Benjamini-Hochberg False Discovery Rate (FDR) correction to control for Type I errors. The significance threshold is set at FDR-adjusted *p* < 0.05.
We have detailed these updated statistical procedures in Section 2.2 (Page 6, Line 160) and revised the captions of Figures 3 and 4 (now Figures 4 and 5). We believe these additions have substantially improved the rigor and credibility of our study. Thank you for helping us enhance this work.
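For reference, the two statistical ingredients named in this exchange can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the paired effect size shown is the standardized mean difference of paired scores (often written d_z), which is one common choice alongside a paired t-test, and the R² values and p-values are made-up numbers.

```python
import statistics


def paired_cohens_d(a, b):
    """Paired effect size d_z: mean of the paired differences over their SD."""
    diffs = [x - y for x, y in zip(a, b)]
    return statistics.mean(diffs) / statistics.stdev(diffs)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-fold R^2 values for two decoders (illustrative only).
proposed = [0.90, 0.80, 0.85, 0.95, 0.90]
baseline = [0.80, 0.75, 0.80, 0.85, 0.80]
d_z = paired_cohens_d(proposed, baseline)

# Hypothetical raw p-values from four comparisons.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
significant = [p < 0.05 for p in adjusted]
print(d_z, adjusted, significant)
```

Reporting the effect size together with the FDR-adjusted p-value covers both parts of the reviewer's request: the magnitude of a difference and its significance after correction.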
Comment5: The research lacks accessible code and processed data and the authors did not include any repository link which creates obstacles for reproducing deep-learning-based research.
Response5: Thank you for your valuable comment regarding the reproducibility of our work. We completely agree that transparency in code and data is crucial for deep-learning-based research.
Due to ongoing institutional review and potential intellectual property considerations related to this project, we are unable to make the code publicly available during the peer-review process. However, we are committed to full transparency upon publication.
To address your concern, we have taken the following steps:
- Detailed Methodology: To ensure reproducibility in the interim, we have expanded the Methods section to include comprehensive details of our deep learning architecture (e.g., layer specifications, hyperparameters, pooling type) and data preprocessing steps. We believe this level of detail is sufficient for a skilled researcher to reimplement our work.
- Reviewer Access (If feasible): If the editor requires access for verification purposes, we are happy to provide the code and data as a supplementary file in the submission system or via a private link during the review process only.
We believe that these detailed descriptions provide a solid foundation for replicating our approach. Furthermore, we have included a statement in the manuscript that the code can be made available upon reasonable request for academic collaboration, subject to institutional approval.
Comment6: The co-contraction estimation results show promising results but the method requires external verification through physiological tests to confirm its precision so the researchers should present these findings with increased caution and additional proof.
Response6: Thank you for your thoughtful feedback regarding the validation of our co-contraction estimation method. We appreciate your recognition of the promising results, and we fully agree that external verification through physiological tests would significantly strengthen the precision and credibility of our approach.
To address your concern, we have made the following revisions to the manuscript:
- Increased Caution in Interpretation: We have carefully revised the language throughout the Results and Discussion sections to reflect a more cautious tone. Specifically, we have:
- Replaced definitive statements (e.g., "We attribute this advantage to the model’s ability to account for muscle co-contraction") with more measured language (e.g., "This advantage may be caused by the measurement of muscle co-contraction").
- Explicitly acknowledged that while our results are promising, they should be interpreted as preliminary pending external validation.
- Acknowledgment of Limitations: We have expanded the Limitations subsection in the Discussion to explicitly address the need for physiological validation. The revised text reads as follows:
“While our approach successfully distinguished between flexion and extension movements, it is critical to emphasize that we have not yet definitively proven that these decoded patterns correspond to actual joint kinematics, nor that the predicted co-contraction torque precisely matches the true physiological torque. Obtaining ground-truth measurements for these parameters in vivo is inherently challenging. Consequently, these findings should be regarded as preliminary, warranting further validation through dedicated physiological experiments.”
- Future Directions: We have added a sentence in the Conclusion or Future Work section outlining our plan to conduct physiological validation studies (e.g., EMG recordings) in subsequent research.
We believe these revisions appropriately temper our claims while clearly acknowledging the need for additional verification. Thank you for helping us improve the rigor and transparency of our study.
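Elsewhere in this review record (Reviewer 2's report), the co-contraction torque is described as the minimum of the extension and flexion branch outputs. A minimal sketch of that operational definition is given below. The function name and sample values are hypothetical, and the result is a proxy computed from decoder outputs, not a physiological measurement.

```python
def co_contraction(extension, flexion):
    """Operational co-contraction proxy: element-wise minimum of the two
    non-negative branch outputs (the activity shared by both directions)."""
    return [min(e, f) for e, f in zip(extension, flexion)]


# Made-up branch outputs for three time steps (both already non-negative).
extension = [1.0, 0.0, 0.5]
flexion = [0.0, 0.75, 0.25]
shared = co_contraction(extension, flexion)
print(shared)
```

The proxy is non-zero only when both branches are active at the same time, which is why it needs external confirmation (for example EMG) before being read as true co-contraction.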
Comment7: The presentation of figures and tables requires improvement through enhanced font sizes and standardized target axis labels because multiple plots need detailed analysis to understand their information.
Response7: Thank you for your constructive feedback regarding the presentation of our figures and tables. We agree that clear and readable visualizations are essential for effectively communicating our findings, and we apologize for any difficulty caused by the original formatting.
In response to your comment, we have thoroughly revised all figures and tables throughout the manuscript. The specific improvements include:
- Increased Font Sizes: We have enlarged the font sizes for all axis labels, tick marks, legends, and figure captions to ensure readability without the need for magnification. All text elements now meet the standard journal guidelines for figure presentation. Examples include Figures 1, 4, and 5.
- Standardized Axis Labels: We have standardized the axis labels across all plots to ensure consistency. For example, in Figure 4, we set all axes to the same scale for the panels showing hand positions.
- Improved Figure Layout: To improve clarity, the original Figure 1 is now presented as two separate figures: Figure 1 shows the model architecture and Figure 2 shows the experimental procedure.
We believe these revisions have substantially improved the clarity and professionalism of our visual presentation. Please find the updated figures and tables in the revised manuscript. Thank you for helping us enhance the quality of our work.
Comment8: The paper needs to include a separate section which focuses on limitations and future directions to discuss three essential topics about subject-specific bias and human BMI scalability and musculoskeletal model sensitivity and explainable AI integration for feature interpretation.
Response8: Thank you for this excellent suggestion. We agree that a dedicated discussion of limitations and future directions is essential for providing a balanced and comprehensive perspective on our work. In response to your comment, we have added a new "Future and Limitation" section to the manuscript (Page 18, Line 459), where we thoroughly address the four essential topics you highlighted.
Below is a summary of the discussion included in this new section:
- Subject-Specific Bias: We acknowledge that our current approach may be influenced by subject-specific biases due to the limited number of subjects. We discuss how this bias could affect the generalizability of our findings and propose that future work should incorporate larger and more diverse cohorts, as well as domain adaptation techniques, to mitigate such bias.
- Scalability to Human BMI: While our study demonstrates promising results in non-human primates, we recognize that scaling the approach to human brain-machine interfaces (BMI) presents additional challenges, including differences in signal fidelity, electrode stability, and neural plasticity. We outline a roadmap for translation, which includes validating the method in human intracortical datasets and adapting the decoder to accommodate the unique characteristics of human neural signals.
- Musculoskeletal Model Sensitivity: We discuss the sensitivity of our co-contraction estimates to the parameters and assumptions embedded in the musculoskeletal model used. We note that inaccuracies in model parameters (e.g., muscle attachment points, moment arms) could propagate errors into the torque predictions. Future directions include performing sensitivity analyses and exploring data-driven or physics-informed models to improve robustness.
Comment: By the way, there are some papers that might be of interest to you (free to download and read).
Response: Thank you for your kindness in sharing these relevant references. We greatly appreciate you taking the time to provide these resources, and we will certainly read them with interest.
We have carefully reviewed the suggested papers and find that they offer valuable insights into BCI decoding with neural networks and into the developments and challenges in brain activity decoding. Where appropriate, we have incorporated these works into our revised manuscript to strengthen the relevant discussions and contextualize our findings within the broader literature.
- Electroencephalogram-Based Motor Imagery Signals Classification Using a Multi-Branch Convolutional Neural Network Model with Attention Blocks
- EEG Channel Selection Techniques in Motor Imagery Applications: A Review and New Perspectives
- A Review of Brain Activity and EEG-Based Brain–Computer Interfaces for Rehabilitation Application
Thank you again for your thoughtful suggestion and support.
Reviewer 2 Report
Comments and Suggestions for Authors
This manuscript proposes a neuroscience-inspired Single-Direction CNN-LSTM decoder for upper-limb joint angular velocity and torque decoding from invasive motor cortical recordings in a macaque. The idea of separately modeling flexion and extension using parallel branches is interesting, and the manuscript addresses an important problem in BMI research, namely decoder generalizability across movement contexts. The study is potentially valuable; however, the current version requires substantial revision before it can be considered for publication.
Major comments
- The overall presentation and manuscript formatting need careful revision.
  The manuscript still contains obvious template/place-holder text such as “Journal Not Specified,” incomplete citation metadata, and formatting artifacts. In addition, several figures appear distorted or contain unreadable symbols. These issues significantly reduce readability and give the impression that the manuscript is not yet ready for peer review.
- The English language requires substantial improvement throughout the manuscript.
  Although the main ideas can be understood, many sentences are grammatically incorrect or awkwardly phrased. Examples include inaccurate article use, tense inconsistency, unclear subject-object relations, and non-native scientific phrasing. Extensive language editing by a fluent scientific English speaker is strongly recommended.
- The novelty should be clarified more precisely.
  The main innovation appears to be the use of separate branches for extension and flexion decoding, followed by subtraction to obtain the net output. However, the manuscript should better explain how this differs conceptually and practically from standard multi-output architectures, and why this design is expected to improve generalization beyond simply increasing model capacity.
- The biological/neuroscientific rationale is interesting but currently not sufficiently validated.
  The model is motivated by prior observations that joint movement variables in different directions may be encoded in distinct neural patterns. However, the current manuscript mostly demonstrates improved decoder performance, not direct biological validation of the underlying modular encoding hypothesis. The authors should be more cautious in wording and clearly distinguish between biological inspiration and biological proof.
- The data acquisition and dataset construction need clearer description.
  The manuscript states that neurons were not recorded simultaneously and that recording sites were changed daily, after which data were resampled and combined into one large dataset. This is a critical methodological point. The authors should explain much more clearly:
  - how neurons recorded across days were combined,
  - whether pseudo-population construction was used,
  - how trial alignment across sessions was ensured,
  - whether cross-validation splits were performed at the trial level or session level,
  - and whether any information leakage may have occurred between training and test sets.
- The generalizability experiment design needs stronger justification.
  The model was trained using Targets IV and V and then fine-tuned on the other targets. The rationale for selecting Targets IV and V is based on trajectory coverage in motion space, but this choice may introduce bias. The authors should test whether the conclusions remain valid under different target selection schemes. Otherwise, the claim of improved generalizability may be overstated.
- Fine-tuning uses a substantial amount of target-specific data.
  The manuscript states that 35–40% of the target data were used for fine-tuning. This is not a very small amount of data, and therefore the claim of strong transfer/generalization should be toned down. The authors should discuss this limitation more explicitly and ideally provide results across multiple fine-tuning sizes (e.g., 5%, 10%, 20%, 40%).
- Statistical analysis should be strengthened.
  The manuscript reports p-values from independent t-tests across five-fold cross-validation results. The authors should justify whether the independence assumption is valid for fold-based results. In addition, it would be helpful to report effect sizes and confidence intervals, not only p-values. Please also specify whether tests were one-tailed or two-tailed and whether multiple-comparison correction was applied.
- The definition of co-contraction is currently operational rather than validated.
  Defining co-contraction torque as the minimum of extension and flexion branch outputs is an intuitive proxy, but it is not a direct physiological measurement. The authors acknowledge this limitation, which is good, but they should be more conservative in interpreting the red curve in Figure 7. At present, this analysis is hypothesis-generating rather than definitive validation of co-contraction decoding.
- The description of the decoder architecture needs to be made fully reproducible.
  Although Table 1 provides some hyperparameters, the current description still leaves ambiguity. Please report clearly:
  - the exact tensor dimensions at each stage,
  - padding and stride settings for each convolution,
  - whether separable convolutions were actually used or standard Conv2D,
  - the precise pooling configuration,
  - dropout placement,
  - initialization method,
  - number of trainable parameters,
  - and the exact early stopping criterion.
  Reproducibility is especially important for a methodological paper.
- Some equations and notation are inconsistent or insufficiently explained.
  For example, symbols are sometimes introduced unclearly, and notation for variables such as torque, angles, and time-series dimensions is not always consistent. Please carefully revise the equations and ensure every symbol is defined exactly once and used consistently thereafter.
- Figures require major improvement.
  Several figures are difficult to read, especially Figures 1, 3, and 4. The font size is too small in many places, some labels are unclear, and several symbols appear corrupted. Figure 1 in particular should be redrawn in a much cleaner way. Figures should be understandable without excessive effort.
- The comparison with baseline models could be strengthened.
  The manuscript compares the proposed approach with a conventional CNN-LSTM and linear regression. These are reasonable baselines, but the authors may consider including an additional modern baseline or ablation analysis, for example:
  - a branch model without subtraction,
  - a multi-head model with shared encoder,
  - or a simpler model with separate outputs but no directional constraint.
  Such analyses would help isolate which component truly contributes to performance.
- Animal ethics information should be explicitly stated.
  The manuscript should clearly indicate approval by the appropriate institutional animal care and use committee, including approval number if available.
Minor comments
- The title is interesting but could be made more specific by mentioning upper-limb joint decoding and flexion/extension modularity.
- In the abstract, please avoid overstatement such as “physiologically meaningful co-contraction patterns” unless directly validated.
- “Brain-machine interface” and “brain–machine interface” should be standardized throughout.
- Please check singular/plural consistency, for example “many researches” should be revised.
- Some references appear very recent or in preprint form; please ensure they are appropriate and accurately cited.
- The manuscript should explicitly state the number of total trials retained after quality filtering.
- Please clarify why negative R² values were set to zero in Figure 4; this may conceal poor performance.
- Please revise all figure legends for clarity and self-containment.
Author Response
Comment1: The overall presentation and manuscript formatting need careful revision.
Response1: Thank you very much for your thoughtful feedback and for taking the time to review our manuscript. We sincerely appreciate your valuable comments and the opportunity to revise our work accordingly.
We have carefully addressed all the issues you raised. Specifically:
- Manuscript Formatting and Template Text: We have thoroughly revised the manuscript to remove all placeholder text, including “Journal Not Specified” and any incomplete citation metadata. The formatting has been standardized in accordance with the journal’s guidelines to ensure a clean and professional presentation.
- Figures and Symbols: All figures have been carefully reviewed and regenerated to ensure that they are clear, properly formatted, and free from distortion. In addition, we have added one figure and expanded the captions to illustrate the content more clearly. Any unreadable symbols have been corrected to improve readability and visual quality.
We believe these revisions have significantly enhanced the clarity and professionalism of the manuscript, and we hope it is now suitable for further consideration in the peer review process.
Comment2: The English language requires substantial improvement throughout the manuscript.
Response2: Thank you for your honest and helpful feedback. We acknowledge that the English language usage in the original manuscript required significant improvement.
To address this, we have undertaken a thorough language editing process. The manuscript has undergone professional scientific editing to correct grammatical errors, improve sentence structure, resolve tense inconsistencies, and ensure clarity of expression. We believe the manuscript now reads fluently and meets the language standards expected for peer review.
We appreciate your guidance and hope the revised manuscript is now suitable for further consideration.
Comment3: The novelty should be clarified more precisely.
Response3: Thank you for your helpful comment. We have revised the manuscript to more precisely clarify the novelty of our approach. We now provide a clear explanation of how our architecture—using separate branches for extension and flexion decoding with subtraction—differs conceptually from standard multi-output architectures. Specifically, we have addressed the following aspects:
- Multi-Branch Architecture: A key distinction of our model from conventional CNN-LSTM decoders is its multi-branch structure. While previous works typically employ a single decoder with shared feature extraction layers to estimate all output variables simultaneously, our model dedicates an independent pair of CNN-LSTM blocks to each joint variable (angular velocity and torque). This design allows each branch to learn specialized features for its specific target, preventing the interference that can occur in fully shared architectures.
- Single-Direction Decomposition: Inspired by the neural encoding of opposing movements, our model decomposes each final output into two unidirectional components. This is achieved by using a ReLU activation function on the output layer of the two blocks within each dedicated pair. This constraint forces one block to learn the positive component (e.g., extension) and the other to learn the negative component (e.g., flexion) of the motion variable. The final, physiologically relevant net motion parameter is then reconstructed by subtracting the negative component from the positive one via a subsequent subtraction layer. We also mention this in the Introduction (Section 1).
We believe these revisions address your concern and strengthen the manuscript. Thank you again for your thoughtful feedback.
Comment4: The biological/neuroscientific rationale is interesting but currently not sufficiently validated.
The model is motivated by prior observations that joint movement variables in different directions may be encoded in distinct neural patterns. However, the current manuscript mostly demonstrates improved decoder performance, not direct biological validation of the underlying modular encoding hypothesis. The authors should be more cautious in wording and clearly distinguish between biological inspiration and biological proof.
Response4: Thank you for this thoughtful comment. We agree that the biological rationale, while motivating, requires more careful framing. In the revised manuscript, we have taken care to more clearly distinguish between biological inspiration and biological validation.
Specifically, we have revised the language throughout to avoid overstating the biological evidence. We now explicitly state that our architecture is inspired by prior observations of direction-selective neural encoding, and we emphasize that the current study demonstrates improved decoding performance rather than providing direct validation of the modular encoding hypothesis.
Comment5: The data acquisition and dataset construction need clearer description.
Response5: Thank you for this critical methodological comment. We agree that the data acquisition and dataset construction procedures require clearer description to ensure transparency and reproducibility. In response, we have substantially revised the Methods section to address each of the points you raised.
Specifically, we now provide a detailed explanation of the following:
- Combining neurons recorded across days: We clarify that because neurons were not recorded simultaneously and recording sites were changed daily, we constructed a pseudo-population by aggregating neural activity across sessions. This approach assumes that the recorded neural population, although varying day to day, reflects consistent underlying neural representations of the motor task.
- Pseudo-population construction: We explicitly state that pseudo-population methods were used to combine non-simultaneously recorded neurons, and we describe the rationale and assumptions underlying this approach.
- Trial alignment across sessions: We explain that trials were aligned based on movement onset to ensure consistent temporal alignment across sessions and recording days. All trials were time-normalized to a common reference frame before concatenation.
- Cross-validation splits: We clarify that cross-validation was performed at the session level, not the trial level. That is, all trials from entire recording sessions were assigned exclusively to either training or test sets. This approach prevents data from the same session, which may share correlated noise or non-stationarities, from appearing in both sets, thereby avoiding information leakage.
- To give a better understanding of the data acquisition and dataset construction procedures, we have added a new Figure 2 to the manuscript.
We appreciate your thorough review, which has helped us significantly improve the methodological clarity of our manuscript.
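As an illustration of the session-level splitting described above, the following is a minimal sketch and not the authors' code. The function name, the session labels, and the round-robin assignment of sessions to folds are assumptions made for the example.

```python
def session_level_folds(trial_sessions, n_folds=5):
    """Return (train_idx, test_idx) pairs in which whole sessions stay together.

    trial_sessions[i] is the (hypothetical) session label of trial i.
    """
    sessions = sorted(set(trial_sessions))
    # Assumption: sessions are dealt to folds round-robin.
    fold_of = {s: k % n_folds for k, s in enumerate(sessions)}
    folds = []
    for k in range(n_folds):
        test = [i for i, s in enumerate(trial_sessions) if fold_of[s] == k]
        train = [i for i, s in enumerate(trial_sessions) if fold_of[s] != k]
        folds.append((train, test))
    return folds


# Hypothetical labels: ten trials recorded over five days.
trial_sessions = ["d1", "d1", "d2", "d2", "d3", "d4", "d4", "d5", "d5", "d5"]
folds = session_level_folds(trial_sessions, n_folds=5)
```

Because the split is keyed on the session label, no session can contribute trials to both the training and the test side of a fold.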
Comment6: The generalizability experiment design needs stronger justification.
Response6: Thank you for this insightful comment. We agree that the generalizability experiment design requires stronger justification, and we appreciate the opportunity to address this concern.
In the original manuscript, we selected Targets IV and V as the training set because they provide the most extensive coverage of the motion space and therefore a rich and diverse set of movement trajectories. We expected a model pre-trained on these two targets to generalize best. Our intention was to train the model on a comprehensive foundation before fine-tuning on new targets.
We acknowledge that this specific selection may introduce bias, and we agree that the validity of our conclusions should not depend on a single training set configuration. To address this concern, we have performed additional analyses using alternative target selection schemes.
Specifically, we repeated the generalizability experiment using either Targets I and II or Targets VII and VIII as the initial training set, followed by fine-tuning on the remaining targets. We then compared the mean performance of the three combinations. The results in Table 1 show that the decoder pre-trained on Targets IV and V, while not achieving the highest performance on any single variable, produced the most consistent and balanced results across all output measures. These results support our original choice.
Thank you for your constructive feedback, which has helped us improve the rigor of our experimental design.
Comment7: Fine-tuning uses a substantial amount of target-specific data.
The manuscript states that 35–40% of the target data were used for fine-tuning. This is not a very small amount of data, and therefore the claim of strong transfer/generalization should be toned down. The authors should discuss this limitation more explicitly and ideally provide results across multiple fine-tuning sizes (e.g., 5%, 10%, 20%, 40%).
Response7: Thank you for this important observation. We agree that the proportion of target-specific data used for fine-tuning (35–40%) is not negligible, and we appreciate the opportunity to address this limitation more rigorously.
In response to your comment, we have reduced the fine-tuning data proportion to 20% and repeated all experiments to further validate our results.
We appreciate your guidance in helping us present our findings with appropriate scientific caution and transparency.
Comment8: Statistical analysis should be strengthened.
The manuscript reports p-values from independent t-tests across five-fold cross-validation results. The authors should justify whether the independence assumption is valid for fold-based results. In addition, it would be helpful to report effect sizes and confidence intervals, not only p-values. Please also specify whether tests were one-tailed or two-tailed and whether multiple-comparison correction was applied.
Response8: Thank you for this careful and constructive comment regarding our statistical analysis. We agree that the statistical reporting requires strengthening, and we have thoroughly revised this section to address each of your points.
- Independence Assumption: We acknowledge that results from five-fold cross-validation are not strictly independent, as the folds are derived from the same dataset and share overlapping training data. In the revised manuscript, we have replaced the independent t-tests with two-tailed paired t-tests to account for the paired nature of the comparisons across folds. We have also added a justification for our choice of statistical test in the Methods section.
- Multiple Comparisons Correction: As you recommended, we have implemented a complete statistical testing procedure. Specifically, when conducting multiple comparisons (e.g., across different time bins, frequency bands, or neural populations), we now apply the Benjamini-Hochberg False Discovery Rate (FDR) correction to control for Type I errors. The significance threshold is set at FDR-adjusted *p* < 0.05.
We have detailed these updated statistical procedures in Section 2.2 (Page 6, Line 160) and revised the captions of Figures 3 and 4 (now Figures 4 and 5).
We believe these additions have substantially improved the rigor and credibility of our study. Thank you for helping us enhance this work.
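For readers unfamiliar with the Benjamini-Hochberg procedure named above, a minimal sketch follows. The p-values are made up for illustration and are not results from the manuscript; the function name is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg FDR-adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values from four comparisons.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
```

With these made-up inputs only the first comparison stays below the FDR-adjusted threshold of 0.05, although three of the four raw p-values were below 0.05.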
Comment9: The definition of co-contraction is currently operational rather than validated.
Defining co-contraction torque as the minimum of extension and flexion branch outputs is an intuitive proxy, but it is not a direct physiological measurement. The authors acknowledge this limitation, which is good, but they should be more conservative in interpreting the red curve in Figure 7. At present, this analysis is hypothesis-generating rather than definitive validation of co-contraction decoding.
Response9: Thank you for this important observation. We agree that our definition of co-contraction torque as the minimum of the extension and flexion branch outputs is an operational proxy rather than a direct physiological measurement. We appreciate your constructive feedback, which has helped us more appropriately frame this analysis.
To address your concern, we have made the following revisions to the manuscript:
- Increased Caution in Interpretation: We have carefully revised the language throughout the Results and Discussion sections to reflect a more cautious tone. Specifically, we have:
- Replaced definitive statements (e.g., "We attribute this advantage to the model’s ability to account for muscle co-contraction") with more measured language (e.g., "This advantage may be caused by the measurement of muscle co-contraction").
- Explicitly acknowledged that while our results are promising, they should be interpreted as preliminary pending external validation.
- Revised Interpretation of Figure 7: In response to your comment, we have substantially revised the interpretation of the results shown in Figure 7.
- Acknowledgment of Limitations: We have expanded the Limitations subsection in the Discussion to explicitly address the need for physiological validation. The revised text reads as follows:
“While our approach successfully distinguished between flexion and extension movements, it is critical to emphasize that we have not yet definitively proven that these decoded patterns correspond to actual joint kinematics, nor that the predicted co-contraction torque precisely matches the true physiological torque. Obtaining ground-truth measurements for these parameters in vivo is inherently challenging. Consequently, these findings should be regarded as preliminary, warranting further validation through dedicated physiological experiments.”
- Future Directions: We have added a sentence in the Conclusion or Future Work section outlining our plan to conduct physiological validation studies (e.g., EMG recordings) in subsequent research.
We believe these revisions appropriately frame the analysis as exploratory and hypothesis-generating, aligning with the level of evidence presented. Thank you for your guidance in helping us present this aspect of our work with appropriate scientific caution.
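The operational definition discussed in this comment can be written out in a few lines. This is a sketch only: the sign convention (extension minus flexion) and the function names are assumptions, and the numbers are invented rather than decoded outputs.

```python
def relu(x):
    """Nonnegativity constraint applied to each branch output."""
    return max(0.0, x)


def combine_branches(ext_raw, flex_raw):
    """Combine per-time-step branch outputs into a net signal and a co-contraction proxy."""
    ext = [relu(v) for v in ext_raw]
    flex = [relu(v) for v in flex_raw]
    # Assumption: net output is extension minus flexion.
    net = [e - f for e, f in zip(ext, flex)]
    # Operational proxy from the manuscript: the minimum of the two branches.
    co_contraction = [min(e, f) for e, f in zip(ext, flex)]
    return net, co_contraction


# Invented pre-activation values for three time steps.
net, co = combine_branches([1.0, 0.5, -0.25], [0.25, 0.5, 0.75])
# net -> [0.75, 0.0, -0.75]; co -> [0.25, 0.5, 0.0]
```

The second time step shows why the proxy is only hypothesis-generating: the net output is zero while the proxy is at its largest, and without EMG or another physiological reference the two readings cannot be told apart from a decoder artifact.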
Comment10: The description of the decoder architecture needs to be made fully reproducible.
Response10: Thank you for your valuable comment regarding the reproducibility of our work. We completely agree that transparency in code and data is crucial for deep-learning-based research.
To address your concern, we have taken the following steps:
- Detailed Methodology: To ensure reproducibility in the interim, we have expanded the Methods section to include comprehensive details of our deep learning architecture (e.g., layer specifications, hyperparameters, pooling type) and data preprocessing steps. We believe this level of detail is sufficient for a skilled researcher to reimplement our work.
- Clear Illustration: We have provided a detailed illustration of our model architecture in the revised Figure 1. We believe this schematic makes the connection relationships among the layers easier to understand.
We believe that these detailed descriptions provide a solid foundation for replicating our approach. Furthermore, we have included a statement in the manuscript that the code can be made available upon reasonable request for academic collaboration, subject to institutional approval.
Comment11: Some equations and notations are inconsistent or insufficiently explained.
For example, symbols are sometimes introduced unclearly, and notation for variables such as torque, angles, and time-series dimensions is not always consistent. Please carefully revise the equations and ensure every symbol is defined exactly once and used consistently thereafter.
Response11: Thank you for this careful observation. We agree that consistency in equations and notation is essential for clarity, and we have thoroughly revised this aspect of the manuscript.
In response to your comment, we have:
- Standardized notation throughout: We have systematically reviewed all equations and variables to ensure consistent notation across the manuscript. Symbols are now defined at their first appearance and used uniformly thereafter.
- Clarified symbol definitions: Each symbol (e.g., for joint angles, torques, time-series dimensions) is now explicitly defined when first introduced. We have ensured that the same symbol is not used to represent different quantities and that similar quantities are represented with consistent notation.
- Improved equation presentation: We have added explanatory text adjacent to key equations to clarify the role of each term and the dimensions of the variables involved. This includes specifying the time-series dimensions for input and output variables.
- Cross-checked with figures and text: We have verified that the notation used in equations is consistent with that used in figures, tables, and the main text to avoid confusion.
These revisions have been applied throughout the manuscript, with particular attention to the Methods section and any sections containing mathematical formulations. We believe these changes significantly improve the readability and precision of the technical presentation.
Comment12: Figures require major improvement.
Several figures are difficult to read, especially Figures 1, 3, and 4. The font size is too small in many places, some labels are unclear, and several symbols appear corrupted. Figure 1 in particular should be redrawn in a much cleaner way. Figures should be understandable without excessive effort.
Response12: Thank you for this important feedback. We agree that the quality and clarity of our figures, particularly Figures 1, 3, and 4, required significant improvement. We have thoroughly revised all figures to address the issues you raised:
Figure 1: We have split the original Figure 1 into two separate figures—now Figure 1 and Figure 2—to improve clarity and facilitate understanding. In Figure 1, we provide a detailed illustration of both our proposed model and the conventional architecture, enabling a clear comparison between the two. In Figure 2, we illustrate the data acquisition and dataset construction procedures, along with a schematic of the simulated monkey experiment.
Figures 3 and 4 (Figure 4 and 5 now): We have carefully revised these figures to improve readability. Key improvements include:
- Increasing font sizes for all axes labels, legends, and annotations to ensure legibility at printed scale.
- Replacing unclear or corrupted symbols with properly rendered characters.
- Simplifying where possible to reduce visual clutter while maintaining all necessary information.
- Ensuring consistent color schemes and line styles across related figures.
All Figures: We have conducted a systematic review of every figure in the manuscript to ensure that:
- All text elements are of sufficient size and resolution.
- Labels are clear and consistent with the notation used in the main text.
- Each figure is self-contained and understandable without excessive reference to the text.
We have also verified that all figures meet the journal's resolution and formatting requirements. We believe these revisions have substantially enhanced the visual clarity and professional presentation of our work.
Comment13: The comparison with baseline models could be strengthened.
Response13: Thank you for this constructive suggestion. We agree that including additional baselines and ablation analyses would strengthen the comparison and help isolate the contribution of key architectural components. In response, we have conducted the additional experiments you recommended.
Specifically, we have added the following models as baselines:
1) LinearNet, a variant that replaces the ReLU activation of the output layer with a linear activation.
2) SharedNet, a variant in which the feature extraction layers are shared across the branches.
The results, which we have added to Section 3.3 of the revised manuscript, confirm that the single-direction design contributes to the high generalizability across targets.
The results of these ablation analyses are shown in Table 2:
We believe these additions substantially address your concern and thank you for helping us improve the rigor of our work.
Comment14: Animal ethics information should be explicitly stated.
The manuscript should clearly indicate approval by the appropriate institutional animal care and use committee, including approval number if available.
Response14: Thank you for this important comment. We agree that animal ethics information must be explicitly stated. In response, we have added the following information to the Acknowledgments section:
All procedures for animal care and experimental protocols were in accordance with the National Institutes of Health Guide for the Care and Use of Laboratory Animals and were approved by the Animal Experimentation Committee of the National Institute for Physiological Sciences.
Thank you for bringing this to our attention.
In addition, we have carefully reviewed the suggested papers and find that they offer valuable insights into neural-network-based BCI decoding and into the development and challenges of brain activity decoding. Where appropriate, we have incorporated these works into our revised manuscript to strengthen the relevant discussions and contextualize our findings within the broader literature.
- Electroencephalogram-Based Motor Imagery Signals Classification Using a Multi-Branch Convolutional Neural Network Model with Attention Blocks
- EEG Channel Selection Techniques in Motor Imagery Applications: A Review and New Perspectives
- A Review of Brain Activity and EEG-Based Brain–Computer Interfaces for Rehabilitation Application
Thank you again for your thoughtful suggestion and support.
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript proposes a “Single-Direction CNN–LSTM” decoder that decomposes each output into nonnegative extension and flexion components (ReLU) and combines them via subtraction to predict net joint angular velocity and torque. The authors report similar performance to a conventional CNN–LSTM when trained on all targets, and improved performance in a cross-target setting when pretrained on two targets and subsequently fine-tuned (primarily at the output layers). The manuscript also proposes a “co-contraction” proxy derived from the minimum of the two branch outputs.
The topic (robust/generalizable neural decoding) is important and suitable. However, the manuscript overstates key claims (notably “generalizability”), lacks essential controls/ablations for the proposed mechanism, and uses under-specified preprocessing/statistical procedures that reduce confidence in the results.
## Major comments
1. The single-direction approach uses paired branches per output, which likely changes total parameters and effective capacity relative to the conventional CNN–LSTM baseline. Parameter counts, FLOPs, and training time are not reported. Report those including total parameter counts for each model (and ideally FLOPs).
2. It is unclear whether gains arise from (i) two-branch decomposition, (ii) nonnegativity (ReLU), (iii) fixed subtraction, or simply optimization/capacity differences. Suggest ablation test (e.g., Two-branch model without nonnegativity (no ReLU; signed branch outputs), Two-branch model with learned combination (replace fixed subtraction with a learned linear mixing layer)).
3. The manuscript uses a pseudo-population (non-simultaneous neurons) and resampling/trajectory similarity procedures, plus trial-averaging and reference trajectory computation. Some steps are described as using “all available motion data,” which risks train–test leakage under cross-validation. Explicitly document the fold-wise pipeline (what is computed on training data vs test data).
4. Independent t-tests across CV folds are not well-justified (folds are paired across models and not independent samples), and multiple comparisons are not controlled. Use paired statistical tests across folds (paired t-test or Wilcoxon signed-rank). Apply multiple-comparison correction (Holm or FDR) across outputs/conditions.
5. Clamping negative R² to zero (noted in a figure caption) biases metrics if used for inference. Do not clamp negative R² to zero for inferential statistics; if clamped for visualization, still report unclamped values in text/tables and use those for tests.
6. Torque prediction is an important outcome, yet the inverse dynamics/manipulandum parameters, units, filtering, and differentiation details are incomplete.
Author Response
Comment1: The single-direction approach uses paired branches per output, which likely changes total parameters and effective capacity relative to the conventional CNN–LSTM baseline. Parameter counts, FLOPs, and training time are not reported. Report those including total parameter counts for each model (and ideally FLOPs).
Response1: Thank you for this insightful comment. We agree that reporting parameter counts, FLOPs, and training time is essential for a fair comparison of model efficiency and capacity.
In accordance with your suggestion, we have now added the following information to the manuscript (see Section 2.7 Environment and hyperparameter):
- Total parameter counts: We report the total number of trainable parameters for each model, including the conventional CNN–LSTM baseline and our proposed single-direction approach with paired branches per output.
- Training time: We include the total training time measured on the same hardware (e.g., NVIDIA Tesla V100) to ensure a fair comparison.
These additions allow for a more transparent evaluation of the trade-off between model capacity, computational cost, and performance. We believe this addresses your concern. The corresponding results have been added and discussed in the revised manuscript.
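To make the capacity concern concrete, the sketch below counts trainable parameters from standard layer formulas. All layer sizes are hypothetical and are not taken from the manuscript; the LSTM formula assumes one bias vector per gate (the Keras convention; PyTorch stores two bias vectors per gate, which adds 4 × units parameters).

```python
def conv1d_params(kernel_size, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_size * in_channels * filters + filters


def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights and one bias vector."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(in_features, out_features):
    """Fully connected layer with bias."""
    return in_features * out_features + out_features


# Hypothetical single stream: Conv1D(k=3, 64 -> 32) -> LSTM(16) -> Dense(1).
stream = conv1d_params(3, 64, 32) + lstm_params(32, 16) + dense_params(16, 1)
# Assumption: a paired-branch design duplicates the whole stream per output.
two_branch = 2 * stream
# stream -> 9329, two_branch -> 18658
```

Under these assumed sizes, duplicating the stream doubles the parameter count, which is why a reported count for each model is needed before attributing gains to the single-direction design rather than to capacity.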
Comment2: It is unclear whether gains arise from (i) two-branch decomposition, (ii) nonnegativity (ReLU), (iii) fixed subtraction, or simply optimization/capacity differences. Suggest ablation test (e.g., Two-branch model without nonnegativity (no ReLU; signed branch outputs), Two-branch model with learned combination (replace fixed subtraction with a learned linear mixing layer)).
Response2: Thank you for your insightful comment regarding the lack of a systematic ablation study for the single-direction design element. We acknowledge that an ablation study would provide a more rigorous analysis of this component's specific contribution.
To address this valuable suggestion, we have now conducted additional experiments that systematically isolate and evaluate the single-direction design by comparing it with:
- LinearNet, a variant that replaces the ReLU activation of the output layer with a linear activation.
- SharedNet, a variant in which the feature extraction layers are shared across the branches.
The results, which we have added to Section 3.3 of the revised manuscript, confirm that the single-direction design contributes to the high generalizability across targets.
We believe these additions substantially address your concern and thank you for helping us improve the rigor of our work.
Comment3: The manuscript uses a pseudo-population (non-simultaneous neurons) and resampling/trajectory similarity procedures, plus trial-averaging and reference trajectory computation. Some steps are described as using “all available motion data,” which risks train–test leakage under cross-validation.  Explicitly document the fold-wise pipeline (what is computed on training data vs test data).
Response3: Thank you for raising this important concern regarding potential train–test leakage. We agree that explicitly documenting the fold-wise pipeline is essential for ensuring rigorous cross-validation.
In the original manuscript, we used the term "all available motion data" to describe the reference trajectory computation. However, we acknowledge that this description was ambiguous and could imply data leakage. In practice, we computed a reference hand trajectory for each target using all available motion data recorded in each session. Trials that exhibited large deviations from this reference within the same day were then discarded. Our cross-validation procedure was strictly fold-wise, performed across sessions rather than trials. The same principle applied to the pseudo-population construction and resampling procedures. To address your concern, we have now revised the manuscript to explicitly document the fold-wise pipeline. Specifically:
- Reference trajectory computation: All reference trajectories were computed independently within each session.
- Cross-validation splits: We clarify that cross-validation was performed at the session level, not the trial level. That is, all trials from an entire recording session were assigned exclusively to either the training set or the test set.
- To give a better understanding of the data acquisition and dataset construction procedures, we have added a new Figure 2 to the manuscript.
We appreciate your thorough review, which has helped us significantly improve the methodological clarity of our manuscript.
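The per-session trial screening described in this response could look roughly like the sketch below. It is not the authors' pipeline: the point-wise mean as reference, the RMS distance, the threshold value, and the one-dimensional toy trajectories are all assumptions.

```python
from math import sqrt


def reference_trajectory(trials):
    """Point-wise mean over the trials of ONE session (assumed definition)."""
    n = len(trials)
    return [sum(t[i] for t in trials) / n for i in range(len(trials[0]))]


def rms_deviation(trial, ref):
    """Root-mean-square distance between a trial and the reference."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(trial, ref)) / len(ref))


def screen_session(trials, max_rms):
    """Keep trials close to the session's own reference; max_rms is hypothetical."""
    ref = reference_trajectory(trials)
    return [t for t in trials if rms_deviation(t, ref) <= max_rms]


# Toy session: three similar trials and one outlier (time-normalized positions).
session = [[0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 1.0, 2.0], [0.0, 5.0, 10.0]]
kept = screen_session(session, max_rms=2.0)
```

Because the reference is computed from one session at a time, nothing from a held-out session enters the screening of a training session when folds are formed at the session level.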
Comment4: Independent t-tests across CV folds are not well-justified (folds are paired across models and not independent samples), and multiple comparisons are not controlled. Use paired statistical tests across folds (paired t-test or Wilcoxon signed-rank). Apply multiple-comparison correction (Holm or FDR) across outputs/conditions.
Response4: Thank you for this important methodological comment. We agree that independent t‑tests are not appropriate for comparing models across cross‑validation folds, as the folds are paired (the same folds are used for both models). Additionally, we acknowledge that multiple comparisons across outputs or conditions should be properly controlled.
To address these concerns, we have revised our statistical analysis as follows:
- Independence Assumption: We acknowledge that results from five-fold cross-validation are not strictly independent, as the folds are derived from the same dataset and share overlapping training data. In the revised manuscript, we have replaced the independent t-tests with two-tailed paired t-tests to account for the paired nature of the comparisons across folds. We have also added a justification for our choice of statistical test in the Methods section.
- Multiple Comparisons Correction: As you recommended, we have implemented a complete statistical testing procedure. Specifically, when conducting multiple comparisons (e.g., across different time bins, frequency bands, or neural populations), we now apply the Benjamini-Hochberg False Discovery Rate (FDR) correction to control for Type I errors. The significance threshold is set at FDR-adjusted *p* < 0.05.
We have detailed these updated statistical procedures in Section 2.2 (Page 6, Line 160) and revised the captions of Figures 3 and 4 (now Figures 4 and 5).
We believe these additions have substantially improved the rigor and credibility of our study. Thank you for helping us enhance this work.
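A paired comparison across folds reduces to a one-sample analysis of the per-fold differences. The sketch below computes the paired t statistic and a paired effect size (Cohen's d_z) with the standard library only; the R² values are invented and the function name is an assumption. The two-tailed p-value would then come from a t distribution with n − 1 degrees of freedom, for example via `scipy.stats.ttest_rel`.

```python
from math import sqrt
from statistics import mean, stdev


def paired_summary(scores_a, scores_b):
    """Return (t statistic, degrees of freedom, Cohen's d_z) for fold-paired scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample standard deviation of the differences
    t_stat = d_mean / (d_sd / sqrt(n))
    cohen_dz = d_mean / d_sd
    return t_stat, n - 1, cohen_dz


# Invented five-fold R² values for two models evaluated on the same folds.
model_a = [0.82, 0.79, 0.85, 0.80, 0.83]
model_b = [0.78, 0.77, 0.80, 0.79, 0.78]
t_stat, dof, d_z = paired_summary(model_a, model_b)
```

With only five folds the test has four degrees of freedom, so reporting the effect size alongside the p-value gives a more informative picture than the p-value alone.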
Comment5: Clamping negative R² to zero (noted in a figure caption) biases metrics if used for inference. Do not clamp negative R² to zero for inferential statistics; if clamped for visualization, still report unclamped values in text/tables and use those for tests.
Response5: Thank you for this critical methodological observation. We agree that clamping negative R² values to zero biases the metric and is inappropriate for inferential statistics, as it artificially inflates performance estimates and distorts comparisons.
In the original manuscript, we applied clamping only for visualization purposes (as noted in the figure caption) to avoid displaying negative R² values that are conceptually difficult to interpret in the context of the figure. However, we acknowledge that this practice was not clearly distinguished from the values used for statistical testing.
To address your concern, we have made the following revisions:
- Unclamped values for inference: All statistical tests and reported summary statistics (means, standard deviations, etc.) in the text and tables now use unclamped R² values, including negative values where they occur.
- Visualization only: Clamping to zero is now applied exclusively for visualization in the figures, and this is explicitly stated in the caption with a clear note that unclamped values are used for all inferential statistics.
These revisions ensure that the inferential statistics are not biased by clamping while maintaining the visual clarity of the figures. We believe this addresses your concern and strengthens the rigor of our analysis. Thank you again for this valuable feedback.
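The distinction between the two uses of R² can be shown with a short sketch. The data are invented, and the variable names are assumptions; only the unclamped value would enter any statistical test.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination; negative when worse than predicting the mean."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented example in which the prediction is anti-correlated with the target.
y_true = [1.0, 2.0, 3.0]
y_pred = [3.0, 2.0, 1.0]

r2_for_statistics = r_squared(y_true, y_pred)  # -3.0, reported and tested as is
r2_for_plotting = max(0.0, r2_for_statistics)  # 0.0, used only for display
```

Clamping maps every failed prediction to the same value of zero, which is why it can hide differences between models if the clamped values are averaged or tested.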
Comment6: Torque prediction is an important outcome, yet the inverse dynamics/manipulandum parameters, units, filtering, and differentiation details are incomplete.
Response6: Thank you for this important observation. We agree that torque prediction is a critical outcome measure, and the technical details regarding inverse dynamics, manipulandum parameters, units, filtering, and differentiation must be fully documented to ensure reproducibility.
In the original manuscript, these details were either partially described or omitted. To address this concern, we have now substantially expanded the methodology section to provide a complete and transparent account of the torque computation pipeline. Specifically, we have added the following information:
- We now explicitly provide the full set of manipulandum and monkey’s arm parameters, including mass and segment length.
- We now describe the filtering procedures in detail and specify the differentiation method used (e.g., the central difference algorithm).
We believe these revisions provide the necessary details for reproducibility and address your concern fully. Thank you again for this valuable feedback, which has helped us significantly improve the clarity and completeness of our methodology.
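For reference, a central difference estimate of a time derivative can be sketched as follows. The sampling interval and the angle samples are hypothetical, and the one-sided differences at the two endpoints are an assumption about boundary handling that the response does not specify.

```python
def central_difference(x, dt):
    """Differentiate a uniformly sampled signal x with sampling interval dt."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two samples")
    v = [0.0] * n
    for i in range(1, n - 1):
        v[i] = (x[i + 1] - x[i - 1]) / (2.0 * dt)  # central difference
    v[0] = (x[1] - x[0]) / dt        # forward difference at the first sample
    v[-1] = (x[-1] - x[-2]) / dt     # backward difference at the last sample
    return v


# Hypothetical joint angle theta(t) = t**2 sampled every 0.5 s.
angle = [0.0, 0.25, 1.0, 2.25, 4.0]
velocity = central_difference(angle, dt=0.5)
# velocity -> [0.5, 1.0, 2.0, 3.0, 3.5]; interior values equal the true 2*t
```

Differentiation amplifies measurement noise, so the order of filtering and differentiation and the filter cutoff affect the resulting torque and are worth stating explicitly.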
In addition, we have carefully reviewed the suggested papers and find that they offer valuable insights into neural-network-based BCI decoding and into the development and challenges of brain activity decoding. Where appropriate, we have incorporated these works into our revised manuscript to strengthen the relevant discussions and contextualize our findings within the broader literature.
- Electroencephalogram-Based Motor Imagery Signals Classification Using a Multi-Branch Convolutional Neural Network Model with Attention Blocks
- EEG Channel Selection Techniques in Motor Imagery Applications: A Review and New Perspectives
- A Review of Brain Activity and EEG-Based Brain–Computer Interfaces for Rehabilitation Application
Thank you again for your thoughtful suggestion and support.
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have partially addressed the reproducibility concern by providing layer-wise information such as filter numbers and kernel sizes. However, the current level of detail remains insufficient for full reproducibility.
In particular, several critical architectural and training parameters are still missing or unclear. The manuscript should explicitly report:
Stride and filter configurations for all convolutional layers
Padding strategy (e.g., same/valid) in a standardized manner
Input dimensionality and feature representation before the LSTM
Detailed LSTM input/output structure (sequence length, reshaping strategy)
Training-related parameters (optimizer, learning rate, batch size, epochs, loss function)
Data preprocessing pipeline (normalization, segmentation/windowing)
Without these details, it is not possible for a researcher to reliably reimplement the proposed model.
Author Response
Response Letter for Bioengineering-4165920
Dear Reviewer,
Thank you for your careful evaluation of our manuscript and for raising important concerns regarding the reproducibility of our proposed model. We sincerely appreciate these constructive comments, which have helped us improve the clarity and completeness of our work.
In response to your feedback, we acknowledge that the original manuscript did not provide sufficient detail to ensure full reproducibility. We have now thoroughly revised the manuscript to include all missing architectural and training specifications. Specifically, we have added the following details:
- Convolutional Layers
We now explicitly report the stride and filter configurations for all convolutional layers, including kernel sizes, number of filters, and stride values for each layer.
- Padding Strategy
The padding scheme (e.g., “same” or “valid”) used in each convolutional layer has been clearly specified in a standardized manner.
- Input Representation to LSTM
We have clarified the input dimensionality and feature representation before the LSTM module (the input dimensionality is the same as the output of the previous layer), including how features are extracted and arranged.
- LSTM Structure and Reshaping Strategy
Detailed information regarding the LSTM input-output structure has been added, including sequence length, feature dimensions, and how the pooling and flatten layers reshape the data to bridge convolutional outputs and sequential inputs.
- Training Configuration
All key training parameters are now included, such as optimizer type, learning rate, batch size, number of training epochs, and the loss function used.
- Data Preprocessing Pipeline
We have provided a comprehensive description of the preprocessing steps, including normalization methods, segmentation/windowing procedures, and any additional data preparation techniques. Further details on data preprocessing can be found in Section 2.2 (from Line 129).
These additions have been incorporated into the revised manuscript (see Section 2.7 and Table 2), ensuring that the proposed model can now be reliably reproduced by other researchers.
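The reason stride and padding must be reported is that together they fix the sequence length that reaches the LSTM. The sketch below uses the output-length rule of the TensorFlow/Keras "same" and "valid" conventions; the framework and the layer sizes are assumptions, since neither is stated in this letter.

```python
from math import ceil


def conv1d_output_length(n_in, kernel_size, stride, padding):
    """Output length of a 1-D convolution under TensorFlow/Keras padding rules."""
    if padding == "same":
        return ceil(n_in / stride)
    if padding == "valid":
        return ceil((n_in - kernel_size + 1) / stride)
    raise ValueError(f"unknown padding: {padding!r}")


# Hypothetical layer: 100 time bins, kernel size 5, stride 2.
len_same = conv1d_output_length(100, kernel_size=5, stride=2, padding="same")
len_valid = conv1d_output_length(100, kernel_size=5, stride=2, padding="valid")
# len_same -> 50, len_valid -> 48
```

A difference of two time steps per layer compounds across stacked layers, so a reimplementation that guesses the padding scheme can end up with a different LSTM sequence length from the one used in the manuscript.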
We are grateful for your insightful comments, which have significantly improved the rigor and transparency of our work. We hope that the revised manuscript meets your expectations for reproducibility.
Thank you for your consideration.
Sincerely,
The authors of Bioengineering-4165920
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
All comments are addressed.
Author Response
Dear reviewer,
Thank you for your consideration. We hope that the revised manuscript meets your expectations.
Sincerely,
The authors of Bioengineering-4165920
Round 3
Reviewer 1 Report
Comments and Suggestions for Authors
The authors have completely addressed all my comments, and I have no further concerns. Therefore, I recommend accepting the paper.
