Effects of Audiovisual Interactions on Working Memory Task Performance—Interference or Facilitation
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
The study “Effects of Audiovisual Interactions on Working memory task performance – Interference or Facilitation” applied a combination of n-back and go/nogo tasks in the auditory and visual sensory modalities to investigate the effects of task load between the senses. Load was manipulated by applying 2-back or 3-back conditions in unisensory and multisensory (AV) conditions. For the low load conditions auditory stimuli had no effect on visual WM and visual stimuli purportedly had a small effect on auditory WM. However, for the high load conditions auditory stimuli had a large effect on visual WM while visual stimuli purportedly had a medium effect on auditory WM.
General comments:
It is not quite clear why a combination of tasks was applied. Why was the n-back not used for both the auditory and visual modalities? It is hard to draw the conclusions you draw, since you are comparing different tasks (WM and response inhibition) in some of the comparisons. The go/nogo task does not measure working memory as claimed. It measures the ability to inhibit responses, so the two cannot be compared directly. Also, it is not clear whether these results stem from audiovisual interactions as described or just from increased task load. The experimental design does not answer this.
Cognitive/attentional load is first introduced in the discussion, while this is quite relevant to the thesis of this paper. Please introduce this properly in the introduction.
Itemized comments:
P. 1 – The title has uneven capitalization.
P. 1 – The abstract is too detailed. No need to include the number and age of subjects here.
P. 1, L. 16 – the semicolon looks awkward here.
P. 1, L. 39 – Put “that includes…visuospatial sketchpad” in brackets for readability
P. 2, L. 61 – a multidisciplinary focus
P. 2, L. 64 – between (not among)
P. 2, L. 65– simultaneously (instead of at the same time)
P. 2, L. 69-75 – This seems a bit handwavy, but certainly these are the most used sensory modalities. Maybe make the point about visual dominance here, which is currently introduced only in the discussion.
P. 2, L. 77-97 – The temporal order judgment literature between the senses may be relevant here, e.g.:
Zampini, M., Shore, D. I., & Spence, C. (2005). Audiovisual prior entry. Neuroscience Letters, 381, 217–222.
Vibell, J., Klinge, C., Zampini, M., Spence, C., & Nobre, A.C. (2007). Temporal order is coded temporally in the brain: Early ERP latency shifts underlying prior entry in a crossmodal temporal order judgment task. Journal of Cognitive Neuroscience, 19, 109-120.
P. 3, L. 113 – The above two effects illustrate that (many ppl found that before you)
P. 3, L. 119 – Remove ‘involving audiovisual interactions’
P. 3, L. 127 – don’t forget to add your hypothesis
P. 4, L. 150-151 – Stimuli were (plural x2)
P. 6, L. 239 – Go/nogo task does not measure working memory and should not be directly compared as such
P. 7, L. 275 – highly significant (not very)
P. 10, L. 318 – Not sure if it's from audiovisual interactions or just increased task load
P. 10, L. 331 - ‘Results showed’ instead of ‘according to our results we showed that’
P. 10, L. 336 – clarify ‘inhibitory performance’
P. 10, L. 338 – ‘suggest’ instead of ‘fully indicate’; besides, we already know full well that vision and audition influence each other.
P. 12, L. 448 – rephrase ‘fully demonstrated’
P. 12, L. 452-454 – You did not discover this. It's called visual dominance, and it has a broad literature that should probably be discussed more fully.
Sinnett, S., Spence, C. & Soto-Faraco, S. Visual dominance and attention: The Colavita effect revisited. Perception & Psychophysics 69, 673–686 (2007). https://doi.org/10.3758/BF03193770
P. 12, L. 458-460 – I am not sure I agree with these numbers, as it's difficult to quantify exactly, but they should at least be referenced
P. 12, L. 460-470 – This restates what you already said in the introduction.
Author Response
Dear editors and reviewers,
Thank you for your letter and for the reviewers’ comments concerning our manuscript entitled “Effects of Audiovisual Interactions on Working Memory Task Performance—Interference or Facilitation” (ID: brainsci-1756218). The comments are all valuable and very helpful for revising and improving our paper, and they provide important guidance for our research. We have studied the comments carefully and have made corrections, which we hope will meet with approval. At the same time, according to your request, we have downloaded a copy of the original manuscript with all the changes marked in red by using the track changes mode in MS Word.
We tried our best to improve the manuscript and made some modifications to it. These changes will not affect the content and framework of the paper. Here, we do not list the changes, but they are marked in red in the revised document.
We sincerely thank the editors and reviewers for their enthusiastic work and hope that the corrections will be approved. More importantly, we hope that the revised manuscript will be accepted for publication by the Journal of Brain Sciences.
Finally, please refer to Attachment 1 for the revised version, see pages 20-30 of Attachment 1 for the reviewer's comments, and see page 31 of Attachment 1 for the polishing certificate.
Once again, thank you very much for your comments and suggestions.
Thank you and best wishes
Sincerely yours
Yang He
Author Response File:
Author Response.docx
Reviewer 2 Report
Comments and Suggestions for Authors
This study explores the impact of audiovisual (AV) interactions on working memory (WM), employing a combined n-back + Go/NoGo paradigm to test resource competition under varying cognitive loads. The authors aim to determine whether concurrent audiovisual stimuli interfere with or facilitate WM, framed within established theoretical models such as visual dominance and resource competition theory. The work presents clear hypotheses, a sound behavioral paradigm, and appropriate statistical analyses.
While the manuscript is methodologically adequate and fits within the scope of the journal, there are weaknesses in task design and data interpretation. Nonetheless, the topic is timely and relevant, and the study has potential with specific improvements.
Major issues
- The auditory go-nogo task is not a working memory task. According to the authors’ description, there is no delay between stimulus presentation and the response window: the button press follows the stimulus immediately, so there is no memory component. Consequently, I don’t think performance on the go-nogo task can be described as working memory in any of the tasks. The current task design is therefore asymmetric; the authors should reframe the study as a test of modality-specific interference during asymmetric dual-tasking rather than audiovisual working memory.
- It is unclear what some of the metrics represent. While “reaction time” is self-evident, the authors should clearly explain what accuracy and visual and auditory “performance” mean and how they were calculated. Intuitively, accuracy and performance seem to be the same thing. Visual performance is explained as the percentage of hits – false alarms, but this is unlikely to hold for the auditory tasks, as the values are above 1. Incidentally, percentages run from 0 to 100; “proportion” would be the correct term for a metric on the 0-1 scale. There seems to be no explanation of the accuracy metric. Overall, performance might be better described by the discriminability score (d′), especially for go-nogo tasks.
- Some “effects” are statistically significant while representing only a 1-2% difference between conditions. The authors should give serious consideration to the difference between statistically detectable differences and biologically meaningful results.
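To illustrate the discriminability score (d′) mentioned above, here is a minimal sketch, not the authors' analysis. The function name `d_prime`, the example counts, and the log-linear correction for extreme rates are all illustrative assumptions; the standard definition is d′ = z(hit rate) − z(false-alarm rate).

```python
from statistics import NormalDist


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumption: a log-linear correction (add 0.5 to each count and 1 to
    each total) keeps rates of exactly 0 or 1 from yielding infinite z.
    """
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)


# Hypothetical counts for one participant in one condition
# (go trials: 45 hits, 5 misses; nogo trials: 5 false alarms, 45 correct rejections)
print(round(d_prime(45, 5, 5, 45), 2))
```

Unlike a raw "hits – false alarms" difference, d′ is expressed in standard-deviation units and separates sensitivity from response bias, which is why it is often preferred for go-nogo data.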
Minor comments
- Somewhat excessive literature review. The introduction section is very long and somewhat repetitive. I would recommend shortening this section, perhaps breaking it up into shorter paragraphs and making the goals of the study clearer.
- Figures 1 and 2: the visual 2-back and 3-back tasks should be labeled in the figures, instead of n-back
- The correlation between reaction time and accuracy should also be displayed in a figure, rather than only presenting the numerical result (this would give the reader a better insight into the variance and goodness of fit).
- The authors refer to “poor visual working memory performance” in the 3-back + go-nogo task but it is unclear whether this is indeed “poor performance” or just worse than the 3-back visual alone. The authors should either present some sort of benchmark for strong vs poor performance or use relative terms in their descriptions.
- From a multisensory integration perspective, it is somewhat surprising to only see increased reaction time in the dual tasks. When cognitive load is low (2-back dual task), one might expect lower reaction time on target+go trials due to the speed-up caused by congruent visual and auditory stimuli. This could be explored by splitting the already existing data by trial type.
- The manuscript should clarify which experiment is testing which theoretical framework (visual dominance, resource competition theory, and attentional load theory).
- The all-male, highly intelligent sample should be noted as a population from which it is not easy to generalize.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Thank you for addressing all my comments thoughtfully. I am happy with all the responses. The two issues of comparing quite different tasks and quite different task loads remain. Those are fundamental design flaws that should be addressed in future studies. I am OK with you just mentioning them in this study, but I believe, in general, that they are important to address to really get to the bottom of this topic.
