Learning Autonomy and Group Cohesion in Clinical Simulation: A Quasi-Experimental Comparison of Two Training Approaches
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Dear Authors,
Thank you for submitting your manuscript "Learning Autonomy and Group Cohesion in Clinical Simulation: A Comparison of Two Training Approaches" to Nursing Reports. The study addresses a topic of clear relevance for nursing education and the development of non-technical competencies through clinical simulation. The comparison between a self-directed methodology (MAES©) and a traditional instructor-led approach (SBL) is timely, and the use of a validated short version of the GEQ in the simulation context is a notable strength. The manuscript is generally well structured, follows the TREND statement, and uses an appropriate analytic strategy (ANCOVA with baseline adjustment, complemented by within-group nonparametric tests and effect sizes).
That said, I have identified several methodological, statistical, and reporting issues that should be addressed before the manuscript can be considered for publication. My main comments are summarized below.
Major comments
- Sample size justification and a priori power analysis. No information is provided regarding sample size calculation. Given that the study reports a relatively large sample (N = 311; MAES© = 188, SBL = 123), please justify the sample size a priori (expected effect size, alpha, power, anticipated dropout). If the sample was determined by feasibility (all eligible students during the study period), this should be stated explicitly, together with a post-hoc consideration of the minimum detectable effect.
- Group imbalance and selection bias. The substantial imbalance in province of origin (Murcia: 93.94% in SBL vs. 56.38% in MAES©; p < 0.001) is acknowledged but not sufficiently addressed analytically. Beyond a narrative caveat, please:
- Discuss whether "province of origin" could be a confounder for group cohesion (e.g., through shared social ties, prior acquaintance among classmates, or differences in cohort composition).
- Consider including province of origin (and any other potentially relevant baseline covariate) as an additional covariate in the ANCOVA model as a sensitivity analysis, and report whether the conclusions remain stable.
- Clarify whether students within each university were already organized in stable working groups before the intervention, since pre-existing group composition is a critical determinant of cohesion.
- Unit of analysis and clustering. Group cohesion is, by definition, a group-level construct, but the analysis appears to be performed at the individual student level. Students were trained in groups of 2–3 within each methodology, which implies clustering of observations. The lack of adjustment for the nested structure (students within simulation groups within universities) may inflate the precision of estimates and underestimate standard errors. Please:
- Clarify the unit of analysis explicitly.
- Consider a multilevel/mixed-effects approach (random intercepts for simulation group and/or university) as a sensitivity analysis, or at minimum discuss this limitation in greater depth.
- ANCOVA assumptions. The use of ANCOVA is appropriate in principle, but the manuscript does not report verification of its underlying assumptions:
- Linearity between the covariate (baseline GEQ score) and the post-intervention score within each group.
- Homogeneity of regression slopes (group × covariate interaction).
- Normality and homoscedasticity of residuals.
- Independence of observations (see point 3).
Please report these checks. If assumptions are not met, consider robust alternatives (e.g., rank-based ANCOVA, bootstrapped confidence intervals, or generalized linear models with appropriate link functions). Additionally, the use of ANCOVA on Likert-type ordinal data with three-item subscale sums (theoretical range 3–15) deserves brief methodological justification.
- Inconsistency between primary and secondary analyses. The primary analysis is ANCOVA (parametric), while the secondary within-group analysis uses Wilcoxon signed-rank tests (nonparametric), suggesting non-normality of the outcome. This inconsistency should be explained. If the GEQ scores are not normally distributed (which the choice of Wilcoxon suggests), the rationale for using parametric ANCOVA as the primary test should be made explicit, and the robustness of the ANCOVA result should be examined.
- Causal language in the Discussion. The Abstract and Conclusions are appropriately cautious, but several passages in the Discussion drift toward causal language inconsistent with the quasi-experimental design. For example:
- "the improvement in group cohesion is due to the intervention" (p. 7–8)
- "the MAES© methodology enhances both task-oriented cohesion and social cohesion" (p. 8)
- "implementing MAES© could significantly contribute to training competent, resilient, and cohesive teams" (p. 9)
Please revise these statements to consistently use associational language (e.g., "was associated with," "may contribute to") throughout the Discussion and Conclusions, in line with the TREND framework and the limitations the authors themselves acknowledge.
- Confidence intervals. Effect sizes (ηp² and Rosenthal's r) are reported, but no 95% confidence intervals are provided. Please add CIs for both ANCOVA effect sizes and within-group effect sizes, as this would substantially strengthen the interpretability of the magnitude of the observed differences.
- Description of the intervention (fidelity). The manuscript states that instructors received training and that adherence was monitored, but does not describe:
- The content and duration of instructor training in each methodology.
- The specific procedures used to monitor implementation fidelity (e.g., checklists, video review, observer ratings).
- Whether instructors were the same across both groups or different (a potential source of confounding).
- Whether the four sessions were identical in content between groups, or whether scenario content differed.
Please expand the Methods section accordingly, ideally with a brief structured description following an appropriate framework (e.g., TIDieR).
- Missing data. No information is provided regarding missing data, dropouts between the pre- and post-intervention assessments, or how non-completers were handled. Please report:
- The number of participants assessed at each time point.
- A flow diagram (recommended by TREND).
- The handling of missing data (complete-case, imputation, etc.).
- Ceiling effects. Median post-intervention scores in the MAES© group reach 13–14 out of a maximum of 15 in several dimensions, with reduced IQRs. This raises the possibility of a ceiling effect that may attenuate detectable changes and inflate apparent group differences. Please discuss this possibility.
Minor comments
- Title and Abstract. The title could more accurately reflect the design ("A Quasi-Experimental Comparison..."). In the Abstract, consider reporting the actual ANCOVA effect sizes (ηp²) and a brief statement of sample size in the Methods subsection.
- Keywords. Consider replacing "quasi-experimental study" with a MeSH term such as "Non-Randomized Controlled Trials as Topic," and adding "Self-directed learning" and "Interprofessional education" if applicable.
- Introduction. The Introduction is comprehensive but somewhat lengthy. Consider tightening sections that repeat the rationale for clinical simulation in successive paragraphs (e.g., p. 2, lines 78–100 and p. 3, lines 92–100), and sharpening the research gap that motivates this specific study.
- Table 2. The "000" entry under the p-value column for province of origin appears to be a typographical artifact and should be removed. Also, please report exact p-values where possible (e.g., p < 0.001) and provide test statistics (chi-square value, U value) for each comparison.
- Table 3. Please consider also reporting means (SD) alongside medians (IQR) to facilitate comparability with the ANCOVA results, which are based on means.
- Table 4. Please add degrees of freedom for the F-statistic, the adjusted post-intervention means with 95% CIs for each group, and the mean difference between groups (with 95% CI). This would make the magnitude and direction of the effect more transparent.
- Table 5. The notation in column headers is inconsistent (e.g., "ATG-S_1 – ATG-S_2", "ATG-T 1 – ATG-T 2"); please standardize. Also clarify what the asterisk on the Z values denotes within the footnote in a more visible way.
- References.
- Several entries display ambiguous diacritics or formatting issues (e.g., "García-Álvarez" appears as "García-Á lvarez"; "Ölnes" as "Ø lnes"). Please verify all references.
- Reference 33 (García-Álvarez et al., Eur. J. Investig. Health Psychol. Educ. 2026) appears to address a closely related topic by the same author group; please clarify the relationship and any overlap with the current manuscript to rule out duplicate publication concerns.
- Reference 19 (the validation of the GEQ short version) is also authored by the same team. Self-citation is justified here, but please ensure that the overall number of self-citations remains proportionate.
- Ethics and consent. Please specify whether the simulation sessions were part of the regular curriculum and, if so, how voluntariness of participation in the research (rather than in the teaching activity itself) was guaranteed, given the potential power asymmetry between students and instructors/researchers.
- Conclusions. The Conclusions section partially restates the Discussion. Consider tightening it to focus on (i) what the study adds, (ii) the appropriately cautious interpretation given the design, and (iii) clear directions for future research.
- Language. The English is generally clear and understandable, but the manuscript would benefit from a careful language revision to address minor issues (article use, occasional awkward phrasing, and repetition of "associated with" in close proximity throughout the Discussion).
In summary, the study addresses a relevant question with an appropriate analytic framework, but the manuscript requires substantial revisions concerning: (i) sample size and power justification, (ii) clearer handling of baseline imbalance and potential clustering, (iii) verification of statistical assumptions, (iv) consistent associational language in line with the non-randomized design, (v) more detailed description of the intervention and fidelity monitoring, and (vi) reporting of missing data and confidence intervals. With these improvements, the manuscript could provide a valuable contribution to the literature on simulation-based nursing education.
I thank the authors for the opportunity to review this work and wish them success in the revision process.
Sincerely,
The Reviewer
Author Response
Reviewer 1
We are grateful to the reviewer for the thorough assessment of our manuscript and for the constructive comments provided. The observations and recommendations have been extremely helpful in refining the manuscript and improving its scientific rigor, transparency, and presentation. We have carefully considered each point raised and provide detailed responses below, together with a description of the corresponding changes made in the revised manuscript.
Comments 1: Sample size justification and a priori power analysis.
No information is provided regarding sample size calculation. Given that the study reports a relatively large sample (N = 311; MAES© = 188, SBL = 123), please justify the sample size a priori (expected effect size, alpha, power, anticipated dropout). If the sample was determined by feasibility (all eligible students during the study period), this should be stated explicitly, together with a post-hoc consideration of the minimum detectable effect.
Response 1:
We appreciate the reviewer’s comment. No a priori sample size calculation was performed because the study was designed using a convenience sample that included all eligible final-year nursing students available during the study period. This has now been explicitly stated in the Methods section.
To provide additional information regarding the adequacy of the final sample, we conducted a post-hoc sensitivity analysis based on the achieved sample size (N = 311). The analysis indicated that the study had approximately 80% power to detect small-to-moderate effects (ηp² ≥ 0.03) at α = 0.05.
We recognize that post-hoc power analyses do not substitute for an a priori sample size calculation and should be interpreted cautiously. For this reason, we present these results only as supplementary information to help readers contextualize the precision of the study estimates. This limitation has now been acknowledged in both the Methods and Discussion sections.
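For readers who wish to see how a minimum detectable effect relates to the achieved sample size, the following is a minimal sketch and not the calculation reported above (the software and settings used for it are not stated here). It assumes a single-degree-of-freedom group contrast, a two-sided test, a normal approximation in place of the noncentral F distribution, and the convention that the noncentrality parameter equals Cohen's f² multiplied by N. The function name is hypothetical.

```python
from statistics import NormalDist


def min_detectable_eta_p2(n_total, alpha=0.05, power=0.80):
    """Approximate the smallest partial eta squared detectable with the given power.

    Assumptions (not taken from the manuscript): a 1-df contrast, a two-sided
    test, a normal approximation, and noncentrality lambda = f^2 * N.
    """
    z = NormalDist()
    # Noncentrality needed to reach the target power at the chosen alpha.
    lam = (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2
    f2 = lam / n_total      # Cohen's f squared
    return f2 / (1 + f2)    # convert f^2 back to partial eta squared


if __name__ == "__main__":
    print(round(min_detectable_eta_p2(311), 3))
```

Under these assumptions and N = 311, the approximation gives a value slightly below the ηp² ≥ 0.03 threshold quoted above; an exact figure would require the noncentral F distribution.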
Comments 2: Group imbalance and selection bias.
The substantial imbalance in province of origin (Murcia: 93.94% in SBL vs. 56.38% in MAES©; p < 0.001) is acknowledged but not sufficiently addressed analytically. Beyond a narrative caveat, please:
- Discuss whether "province of origin" could be a confounder for group cohesion (e.g., through shared social ties, prior acquaintance among classmates, or differences in cohort composition).
- Consider including province of origin (and any other potentially relevant baseline covariate) as an additional covariate in the ANCOVA model as a sensitivity analysis, and report whether the conclusions remain stable.
- Clarify whether students within each university were already organized in stable working groups before the intervention, since pre-existing group composition is a critical determinant of cohesion.
Response 2:
We appreciate this important observation. We agree that province of origin may be related to pre-existing social networks, prior familiarity among students, and differences in cohort composition, all of which could potentially influence perceived group cohesion independently of the educational methodology.
To explore this possibility, we conducted a sensitivity ANCOVA including province of origin as an additional covariate. The overall pattern of results remained unchanged, with similar effect sizes and levels of statistical significance across all dimensions of group cohesion.
We have also clarified that students participated in previously established academic simulation groups within their respective institutions. Although this reflects the natural educational setting in which the intervention was implemented, it may have influenced cohesion outcomes and is now discussed as a potential source of selection bias and residual confounding.
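The logic of a covariate sensitivity analysis of this kind can be sketched as follows. This is an illustration only: the data are synthetic and noiseless, the coding of group and province is hypothetical, and the small least-squares helper stands in for whatever statistical package was actually used. The group coefficient in each model plays the role of the baseline-adjusted between-group difference.

```python
def ols(X, y):
    """Return least-squares coefficients by solving (X'X) b = X'y."""
    k = len(X[0])
    # Augmented normal-equations matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Synthetic, noiseless rows: (baseline score, group, province); all hypothetical.
# group: 1 = MAES, 0 = SBL; province: 1 = Murcia, 0 = other.
rows = [(9, 0, 1), (11, 0, 1), (12, 0, 0), (10, 0, 1), (8, 0, 0),
        (9, 1, 1), (11, 1, 0), (12, 1, 0), (10, 1, 1), (13, 1, 1)]
post = [2 + 0.8 * pre + 1.5 * g + 0.3 * prov for pre, g, prov in rows]

# Primary model: post ~ intercept + baseline + group.
primary = ols([[1, pre, g] for pre, g, _ in rows], post)
# Sensitivity model: add province of origin as a further covariate.
sensitivity = ols([[1, pre, g, prov] for pre, g, prov in rows], post)

group_primary, group_sensitivity = primary[2], sensitivity[2]
print(group_primary, group_sensitivity)
```

Comparing the two group coefficients shows how much the adjusted difference moves once province is controlled for; conclusions are considered stable when the shift is small relative to the estimate and its uncertainty.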
Comments 3: Unit of analysis and clustering.
Group cohesion is, by definition, a group-level construct, but the analysis appears to be performed at the individual student level. Students were trained in groups of 2–3 within each methodology, which implies clustering of observations. The lack of adjustment for the nested structure (students within simulation groups within universities) may inflate the precision of estimates and underestimate standard errors. Please:
- Clarify the unit of analysis explicitly.
- Consider a multilevel/mixed-effects approach (random intercepts for simulation group and/or university) as a sensitivity analysis, or at minimum discuss this limitation in greater depth.
Response 3:
We agree that this issue deserves careful consideration. Although group cohesion is theoretically a group-level construct, the GEQ measures each participant’s individual perception of cohesion. For this reason, the primary analyses were conducted at the individual level.
We considered multilevel modelling; however, simulation groups were very small (typically two to three students), which limited the stability and interpretability of random-effects models. Consequently, ANCOVA was retained as the main analytical approach.
Nevertheless, we acknowledge that the nested structure of the data may have led to underestimated standard errors and inflated precision. This limitation is now discussed more explicitly in both the Methods and Discussion sections.
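The possible loss of precision from clustering can be gauged with the standard design-effect formula, DEFF = 1 + (m − 1) × ICC. The sketch below is illustrative: the intraclass correlation (ICC) is not reported for this study, so the value used is a hypothetical placeholder, and the function names are assumptions introduced here.

```python
def design_effect(mean_cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (m - 1) * ICC."""
    if not 0 <= icc <= 1:
        raise ValueError("this sketch expects an ICC between 0 and 1")
    return 1 + (mean_cluster_size - 1) * icc


def effective_sample_size(n_total, mean_cluster_size, icc):
    """Sample size that independent observations would need for equal precision."""
    return n_total / design_effect(mean_cluster_size, icc)


if __name__ == "__main__":
    # Groups of 3 students and a hypothetical ICC of 0.20.
    print(design_effect(3, 0.20))
    print(effective_sample_size(311, 3, 0.20))
```

Because the simulation groups contain only two to three students, the design effect stays modest unless the ICC is high, but standard errors from an individual-level analysis would still be somewhat too small.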
Comments 4: ANCOVA assumptions.
The use of ANCOVA is appropriate in principle, but the manuscript does not report verification of its underlying assumptions:
- Linearity between the covariate (baseline GEQ score) and the post-intervention score within each group.
- Homogeneity of regression slopes (group × covariate interaction).
- Normality and homoscedasticity of residuals.
- Independence of observations (see point 3).
Please report these checks. If assumptions are not met, consider robust alternatives (e.g., rank-based ANCOVA, bootstrapped confidence intervals, or generalized linear models with appropriate link functions). Additionally, the use of ANCOVA on Likert-type ordinal data with three-item subscale sums (theoretical range 3–15) deserves brief methodological justification.
Response 4:
ANCOVA assumptions were carefully assessed and are now clearly described in the revised Methods section. Linearity, homogeneity of regression slopes, normality, and homoscedasticity of residuals were evaluated using both statistical tests and visual inspection, with no relevant violations detected.
We acknowledge that independence of observations may be partially violated due to clustering; this has been explicitly stated as a limitation.
GEQ subscale scores were treated as approximately continuous, consistent with common practice for composite Likert-type measures. Sensitivity analyses produced results that were consistent with the primary ANCOVA models, supporting the stability of the observed between-group differences.
Comments 5: Inconsistency between primary and secondary analyses.
The primary analysis is ANCOVA (parametric), while the secondary within-group analysis uses Wilcoxon signed-rank tests (nonparametric), suggesting non-normality of the outcome. This inconsistency should be explained. If the GEQ scores are not normally distributed (which the choice of Wilcoxon suggests), the rationale for using parametric ANCOVA as the primary test should be made explicit, and the robustness of the ANCOVA result should be examined.
Response 5:
We appreciate this observation. ANCOVA was used as the primary analysis because it allows adjustment for baseline differences and is appropriate for pre–post controlled designs. It is also considered robust to moderate deviations from normality in moderately large samples.
The Wilcoxon signed-rank test was used as a complementary within-group analysis due to slight deviations from normality and the ordinal nature of the measurement scale.
Both approaches produced consistent results, which has been clarified in the revised manuscript.
Comments 6: Causal language in the Discussion.
The Abstract and Conclusions are appropriately cautious, but several passages in the Discussion drift toward causal language inconsistent with the quasi-experimental design. For example:
- "the improvement in group cohesion is due to the intervention" (p. 7–8)
- "the MAES© methodology enhances both task-oriented cohesion and social cohesion" (p. 8)
- "implementing MAES© could significantly contribute to training competent, resilient, and cohesive teams" (p. 9)
Please revise these statements to consistently use associational language (e.g., "was associated with," "may contribute to") throughout the Discussion and Conclusions, in line with the TREND framework and the limitations the authors themselves acknowledge.
Response 6:
We thank the reviewer for highlighting this issue. To ensure consistency with the quasi-experimental design and the limitations inherent to non-randomized studies, we carefully reviewed the manuscript and replaced causal wording with associative language throughout the Abstract, Results, Discussion, and Conclusions.
Comments 7: Confidence intervals.
Effect sizes (ηp² and Rosenthal's r) are reported, but no 95% confidence intervals are provided. Please add CIs for both ANCOVA effect sizes and within-group effect sizes, as this would substantially strengthen the interpretability of the magnitude of the observed differences.
Response 7:
We agree that confidence intervals provide important information regarding the precision and practical significance of the observed effects. Accordingly, 95% confidence intervals have been added for all partial eta-squared (ηp²) values and for within-group effect sizes (Rosenthal’s r). These are now included in Tables 4 and 5.
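One way to obtain an approximate interval for Rosenthal's r is sketched below. It is not necessarily the method used for Tables 4 and 5: it assumes r = Z / √N with N taken as the number of observations entering the Wilcoxon test, and it uses the Fisher z-transformation, which treats r like a correlation coefficient. Bootstrap intervals are a common alternative. The function names and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def rosenthal_r(z_stat, n):
    """Effect size r = Z / sqrt(N) for a standardized test statistic."""
    return z_stat / math.sqrt(n)


def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for r via the Fisher z-transformation."""
    if n <= 3 or not -1 < r < 1:
        raise ValueError("need n > 3 and -1 < r < 1")
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    centre = math.atanh(r)              # transform r to the z scale
    half_width = z_crit / math.sqrt(n - 3)
    # Back-transform the interval limits to the r scale.
    return math.tanh(centre - half_width), math.tanh(centre + half_width)


if __name__ == "__main__":
    r = rosenthal_r(3.0, 100)           # hypothetical Z and N
    print(r, fisher_ci(r, 100))
```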
Comments 8: Description of the intervention (fidelity).
The manuscript states that instructors received training and that adherence was monitored, but does not describe:
- The content and duration of instructor training in each methodology.
- The specific procedures used to monitor implementation fidelity (e.g., checklists, video review, observer ratings).
- Whether instructors were the same across both groups or different (a potential source of confounding).
- Whether the four sessions were identical in content between groups, or whether scenario content differed.
Please expand the Methods section accordingly, ideally with a brief structured description following an appropriate framework (e.g., TIDieR).
Response 8:
We appreciate this comment and agree that intervention fidelity is particularly important in simulation-based educational research.
We have expanded the Methods section to clarify that both groups worked on the same thematic content, learning objectives, and session duration, while differing in the instructional approach used during the simulation process. Faculty members received training specific to the methodology implemented in their institution and participated in coordination meetings to promote consistency across sessions.
No formal fidelity assessment procedures, such as structured checklists, independent observers, or video-based ratings, were implemented. We therefore acknowledge that fidelity was supported through standardization procedures rather than formally measured, and this limitation has now been explicitly discussed.
Comments 9: Missing data.
No information is provided regarding missing data, dropouts between the pre- and post-intervention assessments, or how non-completers were handled. Please report:
- The number of participants assessed at each time point.
- A flow diagram (recommended by TREND).
- The handling of missing data (complete-case, imputation, etc.).
Response 9:
All eligible students (N = 311) completed both the baseline and post-intervention assessments. Consequently, no missing data or participant attrition occurred during the study, allowing all analyses to be conducted using complete cases.
A participant flow diagram has been added in accordance with TREND guidelines.
Comments 10: Ceiling effects.
Median post-intervention scores in the MAES© group reach 13–14 out of a maximum of 15 in several dimensions, with reduced IQRs. This raises the possibility of a ceiling effect that may attenuate detectable changes and inflate apparent group differences. Please discuss this possibility.
Response 10:
We agree that the high post-intervention scores observed in several MAES© dimensions may indicate a ceiling effect. Consequently, the instrument may have had limited capacity to detect additional improvements among participants who already reported very high levels of cohesion. This issue is now acknowledged as a potential measurement limitation in the Discussion.
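A simple descriptive check for a ceiling effect is the share of respondents scoring at the scale maximum. The sketch below assumes subscale sums with a maximum of 15, as described for the GEQ short version; the scores are invented for illustration and the function name is hypothetical. A commonly used rule of thumb flags a possible ceiling effect when more than about 15% of respondents reach the maximum.

```python
def ceiling_share(scores, max_score=15):
    """Proportion of respondents whose score equals the scale maximum."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s == max_score) / len(scores)


if __name__ == "__main__":
    # Invented post-intervention subscale sums (range 3 to 15).
    example = [15, 15, 14, 13, 15, 12, 15, 11, 10, 15]
    print(ceiling_share(example))
```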
Comments 11: Title and Abstract.
The title could more accurately reflect the design ("A Quasi-Experimental Comparison..."). In the Abstract, consider reporting the actual ANCOVA effect sizes (ηp²) and a brief statement of sample size in the Methods subsection.
Response 11:
Following the reviewer’s suggestion, the title now explicitly identifies the study as quasi-experimental. We also added the sample size to the Methods subsection of the Abstract and included the main ANCOVA effect sizes (ηp²) in the Results subsection.
Comments 12: Keywords.
Consider replacing "quasi-experimental study" with a MeSH term such as "Non-Randomized Controlled Trials as Topic," and adding "Self-directed learning" and "Interprofessional education" if applicable.
Response 12:
Keywords have been revised to improve indexing. “Self-directed learning” was added, and terminology related to study design was updated to align with standardized indexing terms.
Comments 13: Introduction.
The Introduction is comprehensive but somewhat lengthy. Consider tightening sections that repeat the rationale for clinical simulation in successive paragraphs (e.g., p. 2, lines 78–100 and p. 3, lines 92–100), and sharpening the research gap that motivates this specific study.
Response 13:
To improve readability, we condensed repetitive sections describing clinical simulation and strengthened the presentation of the research gap at the end of the Introduction.
Comments 14: Table 2.
The "000" entry under the p-value column for province of origin appears to be a typographical artifact and should be removed. Also, please report exact p-values where possible (e.g., p < 0.001) and provide test statistics (chi-square value, U value) for each comparison.
Response 14:
Table 2 has been corrected to remove typographical errors, standardize formatting, and include exact test statistics and p-values where appropriate.
Comments 15: Table 3.
Please consider also reporting means (SD) alongside medians (IQR) to facilitate comparability with the ANCOVA results, which are based on means.
Response 15:
Means and standard deviations have been added to Table 3 alongside medians and interquartile ranges to facilitate comparison with ANCOVA results.
Comments 16: Table 4.
Please add degrees of freedom for the F-statistic, the adjusted post-intervention means with 95% CIs for each group, and the mean difference between groups (with 95% CI). This would make the magnitude and direction of the effect more transparent.
Response 16:
Table 4 has been updated to include degrees of freedom for the F statistics, adjusted post-intervention means with 95% confidence intervals for each group, and adjusted mean differences between groups with corresponding confidence intervals.
Comments 17: Table 5.
The notation in column headers is inconsistent (e.g., "ATG-S_1 – ATG-S_2", "ATG-T 1 – ATG-T 2"); please standardize. Also clarify what the asterisk on the Z values denotes within the footnote in a more visible way.
Response 17:
Table 5 has been revised to ensure consistent notation across variables. The meaning of the asterisk for Z values has been clarified in the table footnote.
Comments 18: References.
- Several entries display ambiguous diacritics or formatting issues (e.g., "García-Álvarez" appears as "García-Á lvarez"; "Ölnes" as "Ø lnes"). Please verify all references.
- Reference 33 (García-Álvarez et al., Eur. J. Investig. Health Psychol. Educ. 2026) appears to address a closely related topic by the same author group; please clarify the relationship and any overlap with the current manuscript to rule out duplicate publication concerns.
- Reference 19 (the validation of the GEQ short version) is also authored by the same team. Self-citation is justified here, but please ensure that the overall number of self-citations remains proportionate.
Response 18:
We appreciate this observation. Reference 33 and the present study are based on different research questions and analytical objectives. The previous publication examined the association between clinical simulation and group cohesion in general, whereas the current manuscript specifically compares two simulation methodologies (MAES© and SBL) and evaluates their differential association with group cohesion outcomes.
Although both studies were conducted within the broader research line on clinical simulation and group cohesion, they use different analytical approaches and address distinct research objectives. No data, analyses, tables, or results have been duplicated between publications.
Comments 19: Ethics and consent.
Please specify whether the simulation sessions were part of the regular curriculum and, if so, how voluntariness of participation in the research (rather than in the teaching activity itself) was guaranteed, given the potential power asymmetry between students and instructors/researchers.
Response 19:
The Ethics Statement has been clarified. Simulation sessions were part of the regular curriculum; however, participation in the research component (questionnaires and data use) was voluntary, did not affect academic grading, and students could withdraw at any time without consequences.
Comments 20: Conclusions.
The Conclusions section partially restates the Discussion. Consider tightening it to focus on (i) what the study adds, (ii) the appropriately cautious interpretation given the design, and (iii) clear directions for future research.
Response 20:
The Conclusions section has been revised to reduce repetition of the Discussion and to focus on (1) main contribution, (2) cautious interpretation given the study design, and (3) future research directions.
Comments 21: Language.
The English is generally clear and understandable, but the manuscript would benefit from a careful language revision to address minor issues (article use, occasional awkward phrasing, and repetition of "associated with" in close proximity throughout the Discussion).
Response 21:
The manuscript has undergone a thorough language revision to improve clarity, consistency, and readability. Minor grammatical and stylistic issues have been corrected throughout.
We would like to thank the reviewer once again for the valuable feedback. Addressing these comments has allowed us to strengthen several aspects of the manuscript, including methodological reporting, interpretation of findings, and discussion of limitations. We believe that the revised version has benefited substantially from this review process.
Reviewer 2 Report
Comments and Suggestions for Authors
Dear authors and editors, this study addresses an important and relevant topic: the relationship between different clinical simulation methodologies and group cohesion among nursing students, with a view to improving team performance. The manuscript is generally well structured, the chosen statistical approach is appropriate, and the results are potentially useful for educators who train nurses. However, at several points the manuscript overstates cause-and-effect relationships:
- The allocation of participants to groups was based on university affiliation (UCAM = MAES©; UMU = SBL) rather than on random allocation. This leads to significant selection bias, including potential differences in institutional culture, previous training, and faculty approach. The authors acknowledge this, but still use formulations suggesting causal superiority (for example, “giving preference to MAES©”). Please consistently replace causal formulations (for example, “because of”, “as a result”) with associative formulations (for example, “related to”, “showed significant improvements in”).
- A significant baseline difference was found in province of origin (p < 0.001), with the SBL group including many more participants from Murcia. The ANCOVA did not account for this. Include province of origin as a covariate in a sensitivity analysis or justify its exclusion. In addition, other unmeasured factors (for example, previous teamwork experience, personality traits, and instructor characteristics) are not discussed.
- The manuscript emphasizes the size of the within-group effect (Table 5), but the main result should be the between-group differences adjusted for baseline (Table 4). The Discussion sometimes conflates these two analyses, which can lead to within-group changes being misinterpreted as evidence of the superiority of MAES©.
- Restructure the Discussion to present the adjusted between-group results (ANCOVA) first, and then note that within-group improvements were larger in MAES©, without implying a causal relationship.
- In the abstract, the last sentence is very long and confusing. Try to break it down into two sentences.
- Several grammatical and punctuation errors were also detected (for example, missing spaces after full stops, inconsistent use of "MAES©" instead of "MAES®").
- The sample consists entirely of fourth-year nursing students from two Spanish universities. Generalizability to other settings, years of study, or professions is therefore limited. This should be clearly stated in the "Discussion" or "Limitations" section.
Minor revision is required.
The manuscript addresses a relevant and under-researched question, and the data are promising. However, the quasi-experimental design with allocation by university and significant baseline differences in province of origin substantially limit causal inference. The authors must revise the language throughout to reflect associative rather than causal claims, control for or discuss confounding variables more thoroughly, and restructure the Discussion to prioritize adjusted between-group results over within-group effect sizes.
Author Response
Reviewer 2
We are grateful to the reviewer for the thorough assessment of our manuscript and for the constructive comments provided. The observations and recommendations have been extremely helpful in refining the manuscript and improving its scientific rigor, transparency, and presentation. We have carefully considered each point raised and provide detailed responses below, together with a description of the corresponding changes made in the revised manuscript.
Comments 1:
The distribution of participants into groups was based on university affiliation (UCAM = MAES©; UMU = SBL), and not on random allocation. This leads to significant selection bias, including potential differences in institutional culture, previous training, and faculty approach. The authors acknowledge this, but still use formulations suggesting causal superiority (for example, "giving preference to MAES©"). Consistently replace causal formulations (for example, "because of", "as a result") with associative formulations (for example, "related to", "showed significant improvements in").
Response 1:
We agree that allocation by university introduces important limitations and potential institutional confounding. The manuscript has been revised to consistently avoid causal language and to adopt associative terminology throughout.
We have also expanded the limitations section to include institutional, pedagogical, and contextual factors that may have influenced outcomes.
Comments 2:
A significant baseline difference was found in the province of origin (p < 0.001): the SBL group had many more participants from Murcia. The ANCOVA did not take this into account. Include the province of origin as a covariate in a sensitivity analysis or justify its exclusion. In addition, other unmeasured factors (for example, previous experience working in a team, personal qualities, characteristics of the instructor) are not discussed.
Response 2:
Following the reviewer’s recommendation, we conducted a sensitivity ANCOVA including province of origin as an additional covariate. The results remained essentially unchanged.
In addition, we acknowledge that other potentially relevant variables—including prior teamwork experience, personality characteristics, baseline motivation, and instructor-related factors—were not measured and therefore could not be controlled statistically. These sources of residual confounding are now discussed more explicitly in the limitations section.
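For readers unfamiliar with this kind of check, the following is a minimal sketch of a sensitivity ANCOVA, not the analysis actually run for the manuscript. It uses simulated data only: the variable names, the 0/1 coding, the simulated effect sizes, and the small `ols` helper are all assumptions made for illustration. The idea is to fit the baseline-adjusted model with and without province of origin and compare the adjusted group coefficient.

```python
import random

def ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * p for x, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Simulated (hypothetical) data: none of these values come from the study.
random.seed(0)
data = []
for _ in range(300):
    group = random.choice([0, 1])          # assumed coding: 1 = MAES, 0 = SBL
    # Province is deliberately imbalanced between groups in the simulation.
    murcia = 1 if random.random() < (0.56 if group else 0.94) else 0
    pre = random.gauss(50, 8)              # simulated baseline cohesion score
    post = 10 + 0.7 * pre + 3.0 * group + random.gauss(0, 5)
    data.append((group, murcia, pre, post))

post_scores = [d[3] for d in data]

# Primary model: post ~ intercept + group + baseline.
primary = ols([[1.0, d[0], d[2]] for d in data], post_scores)
# Sensitivity model: the same, plus province as an additional covariate.
sensitivity = ols([[1.0, d[0], d[2], d[1]] for d in data], post_scores)

primary_effect = primary[1]          # adjusted group difference
sensitivity_effect = sensitivity[1]  # adjusted group difference, province included

print(f"adjusted group effect (primary):     {primary_effect:.2f}")
print(f"adjusted group effect (sensitivity): {sensitivity_effect:.2f}")
```

If the two coefficients are close, the group difference is not sensitive to adjustment for province in that dataset. In practice one would also compare standard errors and p-values, which a statistics package reports directly.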
Comments 3:
According to the text of the manuscript, the authors emphasize the size of the intra-group effect (Table 5), but the main result should be intergroup differences adjusted for the baseline (Table 4). The discussion sometimes mixes these two factors, which can lead to a misinterpretation of intra-group changes as evidence of the superiority of MAES©.
Response 3:
We agree that the primary interpretation should rely on adjusted between-group analyses. We have reorganized the Discussion so that the adjusted ANCOVA findings are presented first, while within-group analyses are clearly identified as secondary and exploratory.
Comments 4:
Change the structure of the discussion to first present the adjusted results obtained between the groups (ANCOVA), and then note that the improvements within the group were more significant in MAES©, but do not imply a causal relationship.
Response 4:
In accordance with this recommendation, the Discussion now follows a clearer structure in which adjusted between-group comparisons are presented first, followed by within-group findings that are interpreted as supportive rather than primary evidence.
Comments 5:
In the abstract, the last sentence is very long and confusing. Try to break it down into two sentences.
Response 5:
The final sentence of the Abstract has been split into two shorter sentences to improve clarity and readability.
Comments 6:
Several grammatical and punctuation errors were also detected (for example, missing spaces after full stops, inconsistent use of "MAES©" instead of "MAES®").
Response 6:
The manuscript has been carefully revised to correct grammatical, punctuation, and formatting inconsistencies. Terminology has been standardized throughout.
Comments 7:
The sample consists entirely of fourth-year nursing students from two Spanish universities. Generalizability to other settings, years of study, or professions is therefore limited. This should be clearly stated in the "Discussion" or "Limitations" section.
Response 7:
This limitation has been strengthened in the Discussion, emphasizing that findings are limited to fourth-year nursing students from two Spanish universities and may not generalize to other populations or contexts.
Comments 8:
The manuscript addresses a relevant and under-researched question, and the data are promising. However, the quasi-experimental design with allocation by university and significant baseline differences in province of origin substantially limit causal inference. The authors must revise the language throughout to reflect associative rather than causal claims, control for or discuss confounding variables more thoroughly, and restructure the Discussion to prioritize adjusted between-group results over within-group effect sizes.
Response 8:
We appreciate this overall assessment. In response, we revised the manuscript to ensure consistent use of associative language, expanded the discussion of potential confounding factors, and reorganized the Discussion to emphasize adjusted between-group comparisons as the primary findings.
We would like to thank the reviewer once again for the valuable feedback. Addressing these comments has allowed us to strengthen several aspects of the manuscript, including methodological reporting, interpretation of findings, and discussion of limitations. We believe that the revised version has benefited substantially from this review process.
Reviewer 3 Report
Comments and Suggestions for Authors
The manuscript addresses a relevant and timely topic in nursing education, exploring the influence of different clinical simulation methodologies on group cohesion among nursing students. The study presents a consistent conceptual framework supported by current and relevant literature, demonstrating an effort to contextualize the importance of non-technical skills, teamwork, and simulation-based learning in healthcare education. The overall structure of the manuscript is clear and logically organized, following a coherent progression between the introduction, methods, results, and discussion sections.
Nevertheless, the manuscript still presents some aspects that would benefit from further refinement prior to publication. First, although the introduction is comprehensive and well referenced, the scientific gap could be articulated more explicitly and critically at the end of the introduction, clarifying the novel contribution of this study in relation to previous evidence, particularly considering that some of the authors have already published studies on group cohesion and clinical simulation. Methodologically, the major limitation relates to the absence of randomization and the use of participants from different universities, which introduces multiple potential institutional, pedagogical, and cultural confounding factors that are not sufficiently explored in the discussion. The results section could also benefit from a more interpretative presentation rather than predominantly descriptive reporting, facilitating clearer identification of the most meaningful findings.
Author Response
Reviewer 3
We are grateful to the reviewer for the thorough assessment of our manuscript and for the constructive comments provided. The observations and recommendations have been extremely helpful in refining the manuscript and improving its scientific rigor, transparency, and presentation. We have carefully considered each point raised and provide detailed responses below, together with a description of the corresponding changes made in the revised manuscript.
Comments 1:
First, although the introduction is comprehensive and well referenced, the scientific gap could be articulated more explicitly and critically at the end of the introduction, clarifying the novel contribution of this study in relation to previous evidence, particularly considering that some of the authors have already published studies on group cohesion and clinical simulation.
Response 1:
The Introduction has been revised to strengthen the justification for the study and to clarify its novel contribution. Specifically, we now emphasize that previous studies have generally evaluated clinical simulation as a single educational strategy or have examined its association with group cohesion without comparing alternative instructional approaches.
The present study addresses this gap by directly comparing a self-directed simulation model (MAES©) with a more traditional instructor-led model (SBL), thereby exploring whether different levels of learner autonomy are associated with different cohesion outcomes.
Comments 2:
Methodologically, the major limitation relates to the absence of randomization and the use of participants from different universities, which introduces multiple potential institutional, pedagogical, and cultural confounding factors that are not sufficiently explored in the discussion.
Response 2:
We have expanded the Discussion to examine in greater depth the potential influence of institutional, pedagogical, and cultural factors associated with the use of two different universities. We also discuss the possible impact of unmeasured confounders and have removed causal language throughout the manuscript.
Comments 3:
The results section could also benefit from a more interpretative presentation rather than predominantly descriptive reporting, facilitating clearer identification of the most meaningful findings.
Response 3:
The Results section has been reorganized to emphasize the findings with the greatest relevance to the study objectives, particularly the adjusted between-group comparisons. Additional explanatory text has also been incorporated to facilitate interpretation while maintaining a clear distinction between results and discussion.
We would like to thank the reviewer once again for the valuable feedback. Addressing these comments has allowed us to strengthen several aspects of the manuscript, including methodological reporting, interpretation of findings, and discussion of limitations. We believe that the revised version has benefited substantially from this review process.
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Thanks to the authors for incorporating the suggested changes, and congratulations on the final article.

