Submit to this Journal Review for this Journal Propose a Special Issue

Article Menu

Share Help Cite Discuss in SciProfiles

Open AccessArticle

Peer-Review Record

Transforming Teacher Knowledge to Practice: Exploring the Impact of a Professional Development Model on Teachers’ Literacy Instruction and Self-Efficacy

Educ. Sci. 2025, 15(9), 1230; https://doi.org/10.3390/educsci15091230

by Orly Lipka^*

, Adi Bufman, Shelley Shaul

and Tami Katzir

Reviewer 1:

Valentine Okwara

Reviewer 2:

Konstantinos Mastrothanasis

Reviewer 3:

Anika Saxena

Reviewer 4:

Aimee Quickfall

Educ. Sci. 2025, 15(9), 1230; https://doi.org/10.3390/educsci15091230

Submission received: 31 May 2025 / Revised: 10 September 2025 / Accepted: 12 September 2025 / Published: 16 September 2025

(This article belongs to the Section Teacher Education)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

Table 3 - The main column headings and sub-column headings should be formatted to make for easier readability. It is currently clustered together and may be confusing to the reader.

The homogeneity of the sample, 97.5% female, may limit the generalizability of the result. Discuss and address the sample homogeneity.

The authors did not report any objective measures of teacher knowledge or classroom practice.

The study's main focus on grade 2 - 3 literacy makes its applicability to other subjects and/or other grades unclear. For example, what is the study's transferability to STEM subjects and other domains?

The study would have benefited more from the inclusion of a control group or quasi-experimental refinements.

Comparison to teachers without P.D would have strengthened the outcomes of the study.

Sampling limitations may have included self-reported bias; the authors should state how this was mitigated.

Comments for author File: Comments.pdf

Author Response

Changes with accordance to R. 1 are marked in yellow

1. Reviewers' comment: Table 3 - The main column headings and sub-column headings should be formatted to make for easier readability. It is currently clustered together and may be confusing to the reader.

We thank the reviewer for this important observation regarding the clarity of Table 3's formatting. We have revised the table structure to improve readability by clearly separating the main column headers ("Pre-PD Knowledge Level," "Post-PD Knowledge Level," and "Statistical Analysis") and their respective subcolumns (M and SD). The hierarchical structure is now visually distinct, with proper spacing and alignment that eliminates the previous clustering issue. Additionally, we have updated the table title from "Comparisons Between Reports on Knowledge Levels Before and After Continuing Education at the Item Level" to the more precise and accessible "Teachers' Self-Reported Literacy Knowledge Gains Following Professional Development by Content Domain." These formatting improvements ensure that readers can easily follow the progression from knowledge areas to pre-post comparisons to statistical results.

2. Reviewers' comment: The homogeneity of the sample, 97.5% female, may limit the generalizability of the result. Discuss and address the sample homogeneity.

We acknowledge the reviewer's concern regarding the gender homogeneity of our sample (97.5% female). However, this composition reflects the demographic reality of the teaching profession both in Israel and internationally. Globally, women represent 94% of pre-primary education teachers and 66% of primary education teachers (UNESCO, 2020). In Israel specifically, women constitute 78% of primary teachers and 78.68% of lower secondary teachers (OECD, 2019; World Bank, 2018). While our sample's gender distribution limits generalizability to male teachers, it is highly representative of the population most likely to be involved in elementary literacy instruction. We have added this limitation to our discussion and recommend that future research intentionally recruit more male elementary teachers to examine potential gender differences in professional development outcomes, while noting that such recruitment may be challenging given the current demographic composition of the elementary teaching workforce.

References:

OECD. (2019). TALIS 2018 Results (Volume I): Teachers and School Leaders as Lifelong Learners. OECD Publishing.

UNESCO. (2020). 2020 GEM Report - Gender Report: A new generation: 25 years of efforts for gender equality in education. UNESCO Publishing.

3. Reviewer's comment: The authors did not report any objective measures of teacher knowledge or classroom practice.

We acknowledge this important limitation. Our study relied on teachers' self-reported perceptions rather than objective measures. However, research supports the validity of self-report measures in educational settings. Desimone et al. (2010) found teacher self-reports to be strongly correlated with classroom observations, and a recent meta-analysis of professional development studies found that self-report measures of teacher self-efficacy yielded strong effect sizes (g = 0.64, p < 0.01) across 21 studies with 1,412 teachers (Huang et al., 2023). Self-report measures provide valuable insights into teachers' confidence and perceived competence, which are important predictors of implementation behavior, though they may not fully capture actual knowledge gains or classroom practice quality. Future research should incorporate objective measures such as standardized knowledge assessments or classroom observation rubrics to provide a more comprehensive evaluation of professional development effectiveness.

References:

Desimone, L. M., Smith, T. M., & Frisvold, D. E. (2010). Survey measures of classroom instruction: Comparing student and teacher reports. Educational Policy, 24(2), 267-329.

Huang, X., Lee, J. C. K., & Dong, X. (2023). The effect of professional development on in-service STEM teachers' self-efficacy: A meta-analysis of experimental studies. International Journal of STEM Education, 10, 27.

4. Reviewer's comment: The study's main focus on grade 2 - 3 literacy makes its applicability to other subjects and/or other grades unclear. For example, what is the study's transferability to STEM subjects and other domains?

We acknowledge this limitation. Our focus on grades 2-3 literacy instruction limits generalizability to other subjects and grade levels, as elementary literacy has unique pedagogical characteristics that may differ from STEM subjects or secondary education contexts. The professional development needs and content knowledge requirements vary significantly across domains. Future research should examine whether our model's effectiveness extends to other subject areas and grade levels to establish broader applicability and identify necessary domain-specific adaptations.

5. Reviewer's comment: The study would have benefited more from the inclusion of a control group or quasi-experimental refinements.

We acknowledge this significant methodological limitation. However, rigorous experimental designs in professional development research may present a challenges. A recent meta-analysis of reading comprehension professional development identified only 29 experimental and quasi-experimental studies meeting inclusion criteria, with authors noting that "more systematic research is needed" (Rice et al., 2024). Similarly, only 21 controlled studies were identified from 6,365 STEM professional development studies (Huang et al., 2023). These findings highlight complexities including ethical concerns about withholding beneficial development, logistical challenges of randomization, and difficulty controlling contextual variables. Notably, our study examined different implementation levels (0-4 vs. 5-12 lessons) and their relationship to self-efficacy, effectively creating comparison groups that provide dose-response insights. While controlled designs would strengthen causal inferences, our pre-post design with implementation-level analysis provides valuable insights consistent with established field practices.

Reference

Rice, M., Lambright, K., & Wijekumar, K. (2024). Professional development in reading comprehension: A meta-analysis of the effects on teachers and students. Reading Research Quarterly, 59(3), 412-435.

6. Reviewer's comment: Sampling limitations may have included self-reported bias; the authors should state how this was mitigated.

The limitation section includes now all the comments regarding the sampling limitation of this study.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

The overall impression of this manuscript is positive, as it presents a well-structured and theoretically grounded study addressing an important area in teacher professional development for literacy instruction. The research clearly demonstrates the improvement in both teacher knowledge and self-efficacy, highlighting the importance of practical implementation of training tools in the classroom for strengthening professional confidence. The integration of theory and practice, the use of established models of knowledge (CK, PCK, and practical knowledge), and the reference to recent international literature, particularly regarding effective characteristics of professional development programs and teacher self-efficacy, are notable strengths of the work.

Nonetheless, there are several points that merit further attention in order to enhance the scientific rigor and depth of the manuscript. I strongly recommend the inclusion of a power analysis to substantiate whether the sample size (n = 82) is adequate for the statistical procedures employed and to support the validity of the findings. Furthermore, it would be beneficial to provide more detailed validity evidence for the research instruments beyond reporting reliability coefficients, so as to confirm the accurate measurement of the conceptual constructs involved. Since parametric statistical tests are used, it is also essential to assess and report on the normality of the data; if deviations from normality are detected, this should be explicitly acknowledged as a limitation of the study.

In terms of the results presentation, Tables 2 and 3 should be revised to fully comply with APA guidelines, featuring distinct columns for variables and statistical outcomes to facilitate clarity and readability. I also suggest that the research questions be treated as three distinct questions (not as 2a and 2b), and that this revised structure be mirrored in both the results and discussion sections for consistency and ease of navigation. Throughout the results, the statistical value p should be presented in italics, in accordance with international reporting standards.

Regarding the theoretical framing and discussion of findings, I believe that the manuscript would greatly benefit from a deeper engagement with more recent literature that can provide a richer and more nuanced perspective on teacher self-efficacy in literacy instruction and on the quality of professional development programs. I specifically recommend the incorporation of two recent publications, which can add both theoretical depth and practical insight:

First, the article "Professional development quality and instructional effectiveness: Testing the mediating role of teacher self-efficacy beliefs" (https://doi.org/10.1080/19415257.2023.2264309) is highly relevant and could be referenced primarily in the discussion. This publication elaborates on the role of teacher self-efficacy as a mediating factor between the quality of professional development and key indicators of instructional effectiveness. I recommend a formulation such as: "Recent findings demonstrate that teacher self-efficacy acts as a critical mediating variable between the quality of professional development programs and essential indicators of instructional effectiveness, such as clarity of instruction, cognitive activation, and classroom management, thus underscoring the theoretical relevance of the present study’s findings".

Second, the article "Drama-based methodologies and teachers’ self-efficacy in reading instruction" (https://doi.org/10.1080/03323315.2025.2479438) should be considered for integration within the theoretical framework, especially in the section discussing factors that enhance teacher self-efficacy. This work experimentally explores the impact of experiential and drama-based teaching approaches on teacher self-efficacy in reading instruction. Its findings offer comparative evidence on the effectiveness of alternative methodologies in strengthening teachers’ confidence. I suggest explicitly noting, for example: "Recent studies indicate that, beyond conventional methods, the adoption of experiential and drama-based teaching approaches significantly enhances teacher self-efficacy in literacy instruction, thus providing new perspectives for the design of professional development.". In the discussion section, these findings could be leveraged to further reinforce the argument that the results of the current study align with international empirical evidence on the importance of active and experiential practice for cultivating positive professional identity in teachers.

In sum, the manuscript represents a useful contribution to the field of teacher professional development in literacy education. The incorporation of the above suggestions will further strengthen its methodological and theoretical rigor, scientific validity, and overall scholarly contribution.

Author Response

Changes according to R.2 are marked in green

Reviewer's comment: There are several points that merit further attention in order to enhance the scientific rigor and depth of the manuscript. I strongly recommend the inclusion of a power analysis to substantiate whether the sample size (n = 82) is adequate for the statistical procedures employed and to support the validity of the findings.

We appreciate this important suggestion. In response, we have conducted an a priori power analysis to determine whether the sample size was sufficient for the main statistical tests employed (e.g., paired-sample t-tests examining pre- and post-PD differences and Anova bidirectional variance analyses with repeated measures examining examining pre- and post-PD differences and implementation levels). Based on a medium effect sizes (Cohen’s d = 0.5; f = 0.25), α = .05, and power = .95, the required sample size was calculated to be approximately 54 participants. Our final sample of 82 participants exceeded this threshold, indicating that the study was adequately powered to detect meaningful effects. This analysis and rationale have been added to the Methods section of the revised manuscript (see page 7). This effect size assumption is grounded in prior literature on teacher professional development, which has reported moderate gains in teacher knowledge and self-efficacy (e.g., Kraft et al., 2018; Tschannen-Moran & McMaster, 2009).

Reviewer's comment: Furthermore, it would be beneficial to provide more detailed validity evidence for the research instruments beyond reporting reliability coefficients, so as to confirm the accurate measurement of the conceptual constructs involved.

Thank you for this helpful suggestion. We agree that validity evidence is essential to support the appropriateness of our measurement instruments. In the revised manuscript, we have added a more detailed rationale for the construct validity of the two main instruments:

For the Teacher Self-Efficacy in Literacy Instruction (TSELI) scale, we’ve added a clarification that the instrument was originally developed by Tschannen-Moran & Johnson (2011) based on Bandura’s (1997) theoretical model and supported by factor analysis, which identified efficacy dimensions related to writing and oral reading instruction. We also describe the translation and adaptation process of the Hebrew version used in our study (Lipka, 2017), which involved back-translation and expert review by bilingual literacy researchers. These steps support both construct validity and response process validity (see pages 7-8).

For the teacher knowledge questionnaire, we added a rationale for the use of the retrospective pre–post format, which has been shown to reduce response-shift bias and enhance the accuracy of self-assessment in educational interventions (Howard et al., 1979; Bhanji et al., 2012). We further clarified that the items were aligned with Shulman’s (1986) framework and the specific content of the PD program, and that they were reviewed by literacy experts to support content validity. Both tools demonstrated strong internal consistency (α = .93–.95) (see page 8).

Reviewer's comment: Since parametric statistical tests are used, it is also essential to assess and report on the normality of the data; if deviations from normality are detected, this should be explicitly acknowledged as

Thank you for this valuable observation. In response, we conducted Shapiro-Wilk tests for all outcome variables, including pre- and post-scores for self-efficacy, teacher knowledge, and implementation subgroups. The results indicated that most raw scores significantly deviated from normality (p < .05), whereas knowledge change scores were approximately normally distributed. These findings are now reported in the manuscript and acknowledged as a limitation (see page 15). Despite these deviations, we proceeded with parametric analyses based on evidence that such methods are robust in large sample sizes typical of educational studies (e.g., Lumley et al., 2002).

Reviewer's comment: In terms of the results presentation, Tables 2 and 3 should be revised to fully comply with APA guidelines, featuring distinct columns for variables and statistical outcomes to facilitate clarity and readability.

was addressed.

Reviewer's comment: I also suggest that the research questions be treated as three distinct questions (not as 2a and 2b), and that this revised structure be mirrored in both the results and discussion sections for consistency and ease of navigation.

Thank you for this valuable suggestion. We have restructured the manuscript to present three distinct research questions instead of the previous 2a and 2b format. The research questions have been separated and renumbered throughout the entire study, including:

Methods section: Three distinct research questions are now presented. Results section: Reorganized with separate subsections for Research Question 1, Research Question 2, and Research Question 3. Discussion section: Restructured to address each of the three research questions independently

This revision provides improved consistency and ease of navigation throughout the manuscript while maintaining the integrity of our analytical approach.

Reviewer's comment: Throughout the results, the statistical value p should be presented in italics, in accordance with international reporting standards.

Reviwer's comment: First, the article "Professional development quality and instructional effectiveness: Testing the mediating role of teacher self-efficacy beliefs" (https://doi.org/10.1080/19415257.2023.2264309) is highly relevant and could be referenced primarily in the discussion. This publication elaborates on the role of teacher self-efficacy as a mediating factor between the quality of professional development and key indicators of instructional effectiveness. I recommend a formulation such as: "Recent findings demonstrate that teacher self-efficacy acts as a critical mediating variable between the quality of professional development programs and essential indicators of instructional effectiveness, such as clarity of instruction, cognitive activation, and classroom management, thus underscoring the theoretical relevance of the present study’s findings".

Thank you for directing our attention to the important work by Yoon and Goddard (2023). We have incorporated their findings throughout our manuscript to strengthen our theoretical framework and empirical support. Specifically, we added content to the "Teachers' Self-Efficacy in Literacy" and "Effective Professional Development" sections referencing their large-scale international validation (97,729 teachers across 45 countries) of self-efficacy as a mediator between PD quality and instructional effectiveness. Additionally, we enhanced our Discussion sections for Research Questions 2 and 3 by demonstrating how our Israeli literacy-focused findings align with their international cross-domain evidence, particularly regarding the mediating role of self-efficacy beliefs and domain-specific effects. These additions significantly strengthen our manuscript by providing robust international empirical support and positioning our findings within the broader global research landscape.

Reviewer 3 Report

Comments and Suggestions for Authors

The scholarly article "Transforming Teacher Knowledge to Practice: Exploring the Impact of a Professional Development Model on Teachers' Literacy Instruction and Self-Efficacy" investigates how a professional development (PD) initiative can augment educators' literacy instruction and self-efficacy. It underscores the pivotal significance of efficacious PD in enhancing educators' competencies and confidence, which are vital for catering to the diverse needs of students. The research encompassed 82 educators and evaluated the modifications in their knowledge and self-efficacy before and after participating in the program. The findings revealed marked enhancements, particularly among those who frequently applied PD strategies. The PD model amalgamated content knowledge, pedagogical content knowledge, and practical implementation, successfully bridging the gap between theory and practice. The outcomes imply that effective PD should concentrate on acquiring knowledge and its practical application in the classroom to bolster teacher self-efficacy and elevate student literacy, ultimately providing significant insights into successful PD methodologies. The suggestions for improvements are:

Introduction and Abstract

The introduction should be expanded to clearly articulate the gaps in the teacher professional development (PD) literature, especially regarding literacy and teacher self-efficacy. Emphasising the significance of these areas will provide essential context for readers, particularly those unfamiliar with educational research. Further, the abstract could benefit from a brief overview of the key components of the PD model and the analytical methods utilised. Including a note on the study’s limitations would give readers a more transparent snapshot of the research scope.

Literature Review and Theoretical Background

The literature review should include a more comprehensive discussion of Content Knowledge (CK) and Pedagogical Content Knowledge (PCK). Defining these concepts in accessible language and explaining their relevance to literacy teaching will enhance reader understanding. Furthermore, strengthening connections with existing studies through a comparative analysis of the current and previous PD models will situate this research within the broader literature. Using simple analogies can help beginner researchers grasp these relationships.

Methodology

Clarification is needed on the participant selection process and how the PD sessions were implemented. A detailed explanation of how teachers were categorised into small and large application groups will improve transparency. While the measures and tools employed (such as self-efficacy questionnaires and retrospective knowledge assessments) are well explained, a concise rationale for these choices concerning the study’s goals would enhance readability for novice readers.

Data Analysis and Results

An elaboration on the statistical techniques used, such as t-tests and ANOVA analyses, is essential. Describing what these tests reveal in layman’s terms (e.g., their role in determining the significance of changes in self-efficacy and knowledge) will assist less experienced readers in understanding the analysis. A more detailed discussion of the interaction effects, with relatable examples demonstrating how increased teaching frequency correlates with improved self-efficacy, would enhance clarity.

Discussion and Conclusion

The discussion section would benefit from succinct summaries of key findings, linking them back to the literature review. Providing practical examples of how improved literacy instruction affects student outcomes can increase the relevance of the findings. Concluding with specific recommendations for practitioners and scholars and suggestions for future research, such as exploring varying dosage levels of PD training, would greatly enrich the paper.

Overall Clarity and Accessibility

Consider simplifying technical terminology or providing brief definitions upon first mention. Enhancing readability through bullet-point summaries or subheadings within major sections will create a more precise roadmap for readers, improving overall engagement with the content.

Author Response

changes according to R.3 are marked in blue.

Reviewer's comment: The introduction should be expanded to clearly articulate the gaps in the teacher professional development (PD) literature, especially regarding literacy and teacher self-efficacy. Emphasising the significance of these areas will provide essential context for readers, particularly those unfamiliar with educational research.

Thank you for this valuable suggestion. We have expanded the introduction to clearly articulate the gaps in teacher professional development literature, particularly regarding literacy and self-efficacy. We added a comprehensive paragraph in the introduction that identifies three critical gaps in the current literature:

Fragmented research on literacy-specific professional development across grade levels
Limited systematic research on PD effects on teacher self-efficacy in literacy contexts (compared to STEM)
Insufficient research on the relationship between teacher self-efficacy, professional development practices, and implementation levels

This expansion incorporates recent systematic reviews and meta-analyses (Rice et al., 2024; Zhou et al., 2023; Larsen & Bradbury, 2024) and clearly positions our study as addressing these identified gaps.

Reviewer's comment: The abstract could benefit from a brief overview of the key components of the PD model and the analytical methods utilised. Including a note on the study’s limitations would give readers a more transparent snapshot of the research scope.

Thank you for this feedback. We have revised the abstract to include: Key PD model components (content knowledge, pedagogical content knowledge, and practical implementation), Analytical methods (t-tests and 2×2 ANOVA with 82 teachers) and study limitations (self-reported measures and homogeneous sample). These additions provide readers with a more transparent and complete snapshot of the research scope and methodology.

Reviewer's comment: The literature review should include a more comprehensive discussion of Content Knowledge (CK) and Pedagogical Content Knowledge (PCK). Defining these concepts in accessible.

Thank you for this feedback. We have expanded the "Teachers' Knowledge" section to provide more comprehensive and accessible definitions of CK and PCK. we added clear, accessible definitions with literacy-specific examples. in addition, we incorporated recent empirical evidence from Ball et al. (2008), McCutchen et al. (2002), and Piasta et al. (2009). Furtheremore, connected CK/PCK distinctions to professional development design and effectiveness was addeded.

Reviewer's comment: strengthening connections with existing studies through a comparative analysis of the current and previous PD models will situate this research within the broader literature. Using simple analogies can help beginner researchers grasp these relationships.

Thank you for this suggestion. We have added a new subsection "Comparison with Existing Professional Development Models" that compares our model with established frameworks (Desimone, 2009; Tschannen-Moran & McMaster, 2009), uses accessible analogies (three-legged stool, bridge structure) to clarify relationships between PD models and positions our research within the broader literature while highlighting unique contributions.

Reviewer's comment: Clarification is needed on the participant selection process and how the PD sessions were implemented. A detailed explanation of how teachers were categorised into small and large application groups will improve transparency.

Thank you for highlighting the need for further clarification regarding participant grouping
and PD implementation. We have revised the manuscript to clarify that teachers were not
assigned to implementation groups in advance . All participants received the
same PD structure and content. In the post-test given after the PD, teachers self-
reported the number of lessons they had implemented in class. These implementation
levels varied naturally due to personal factors and school-related constraints, and were
not influenced by any research team intervention. To examine differences in outcomes
based on implementation, we conducted a post hoc grouping of participants according to
the median number of lessons taught (Mdn = 4). This resulted in two groups: a small-
implementation group (0–4 lessons) and a high-implementation group (5–12 lessons).
This clarification has been added to the Methods section to improve transparency
regarding group categorization.

Reviewer's comment: An elaboration on the statistical techniques used, such as t-tests and ANOVA analyses, is essential. Describing what these tests reveal in layman’s terms (e.g., their role in determining the significance of changes in self-efficacy and knowledge) will assist less experienced readers in understanding the analysis.

Thank you for this suggestion. We have added a brief explanation of our statistical techniques (t-tests and ANOVA) in layman's terms, describing what these tests reveal about significance of changes in knowledge and self-efficacy. This makes our analysis more accessible to readers with varying statistical expertise.

Reviewer's comment: A more detailed discussion of the interaction effects, with relatable examples demonstrating how increased teaching frequency correlates with improved self-efficacy, would enhance clarity.

Thank you for this suggestion. We have expanded the interaction effects discussion in the Results section by: Adding specific numerical examples (low implementation: 5.56→5.60; high implementation: 5.33→5.83), explaining the 'dose-response' relationship between teaching frequency and self-efficacy gains and including concrete teacher comparisons to illustrate how implementation level affects confidence.

Reviewer's comment: The discussion section would benefit from succinct summaries of key findings, linking them back to the literature review. Providing practical examples of how improved literacy instruction affects student outcomes can increase the relevance of the findings. Concluding with specific recommendations for practitioners and scholars and suggestions for future research, such as exploring varying dosage levels of PD training, would greatly enrich the paper.

Thank you for this valuable feedback. We have enhanced the discussion section by:

Adding succinct summaries at the beginning of each research question discussion to clearly highlight key findings
Including a new section on practical implications that provides concrete examples of how improved literacy instruction affects student outcomes
Adding comprehensive recommendations for practitioners, researchers, and policy makers
Suggesting

Reviewer's comment: Consider simplifying technical terminology or providing brief definitions upon first mention. Enhancing readability through bullet-point summaries or subheadings within major sections will create a more precise roadmap for readers, improving overall engagement with the content.

Thank you for this helpful suggestion about improving readability. We have enhanced the manuscript by: adding brief, accessible definitions for key terms (CK, PCK, ANOVA) upon first mention and simplifying statistical language while maintaining precision. In addition, we addedd subheadings within major sections.

Author Response File: Author Response.pdf

Reviewer 4 Report

Comments and Suggestions for Authors

Thank you for submitting your paper. I really enjoyed reading this and have some thoughts on how it could be improved so that it really reaches the maximum audience and is cited in future work.

The main advice is that you need to give some context about the Israeli education system and workforce – otherwise your findings are difficult for scholars in other countries to get a sense of. There is more detail below about why this is the case.
You are sometimes drawing on really old research references – there is a lot of more recent work you could be drawing upon for Teacher PD, which would strengthen your work.
Your discussion needs to acknowledge some of the limitations of your study.

Feedback

The first paragraph reads rather strangely – is this a global issue? If so, can you share some data? If you are focusing on Israel, can you make that clear here?

This section is hard on the reader and the meaning is not clear:

As with many countries worldwide, also classes in the country where the study took place, there are currently characterized by heterogeneous populations with different needs; despite the efforts of the education system, most teachers have not received sufficient PD in the literacy field.

When you introduce CK and PCK, the first thing to tell us is what they stand for – content knowledge and pedagogical content knowledge, then tell us what those mean.

Reference to the Berman study for RAND – this is really old now – I think to earn a place in a current paper you need to say why this is still relevant.

Again, Henson and Ross references – there are much more up to date studies you could cite here to support these – Ross is 30 years old now.

I would like to read about the education system in Israel relevant to this study, teacher PD for example, and the demographics of the teaching population, as I am not clear about how this research may be relevant to other contexts, or how your data reflects the population. For example:

Measures section – 80 out of 82 participants were female – is this reflective of the teaching population in Israel?

Teachers were from Grade 2 and 3 – what does this mean – age of children, sense of the literacy curriculum for these grades?

In your diagrams and charts, sometimes the time goes left to right, sometimes right to left – could these be standardised please?

Author Response

*Revision marked in pink

Reviewer's comment: he main advice is that you need to give some context about the Israeli education system and workforce – otherwise your findings are difficult for scholars in other countries to get a sense of.

Thank you for your suggestion to elaborate on the Israeli school context regarding professional development. We have added a comprehensive section that addresses the specific gaps in preservice teacher training and professional development requirements in Israel.

Reviewer's comment: You are sometimes drawing on really old research references – there is a lot of more recent work you could be drawing upon for Teacher PD, which would strengthen your work.

Thank you for your suggestion to update the literature review. We have substantially strengthened our manuscript by incorporating recent meta-analytic evidence and systematic reviews. Specifically, we added Rice et al.'s (2024) comprehensive meta-analysis of professional development in reading comprehension, Zhou et al.'s (2023) meta-analysis on STEM teacher self-efficacy, Yoon and Goddard's (2023) large-scale international study of 97,729 teachers across 45 countries demonstrating self-efficacy as a mediator between PD quality and instructional effectiveness, and Larsen and Bradbury's (2024) systematic review on teacher self-efficacy and professional development. These recent publications not only support our theoretical framework but also highlight gaps our study addresses, particularly regarding implementation-level effects on self-efficacy in literacy contexts.

Reviewer's comment: The first paragraph reads rather strangely – is this a global issue? If so, can you share some data? If you are focusing on Israel, can you make that clear here?

Thank you for your important feedback regarding the clarity of the opening paragraph. We have revised the introduction to address your concern by clearly establishing the scope and context of the literacy achievement issue. The revised opening now begins by acknowledging that improving literacy achievement is an ongoing challenge for educational systems worldwide, supported by international assessment data from PIRLS that demonstrates variations in student reading performance across countries. We then establish the broad consensus in literature regarding teacher quality as a critical school-related factor before specifically positioning our study within the Israeli context, where recent international assessments have shown concerning trends in literacy achievement.

Reviewer's comment: When you introduce CK and PCK, the first thing to tell us is what they stand for – content knowledge and pedagogical content knowledge, then tell us what those mean.

Thank you for your suggestion to improve the clarity of our CK and PCK terminology. We have addressed this by revising the introduction of these key concepts to first provide the full terms before using abbreviations. The revised text now reads: "

Reviewer's comment: Reference to the Berman study for RAND – this is really old now – I think to earn a place in a current paper you need to say why this is still relevant.

hank you for your observation regarding the Berman et al. (1977) RAND study. We have addressed this by adding contextual framing that positions this as seminal research whose core finding—that teacher self-efficacy is the most significant variable for teacher change—has been consistently validated across nearly five decades. We now connect this foundational insight to recent meta-analytic evidence, including Zhou et al.'s (2023) findings on PD and Yoon and Goddard's (2023) international study of 97,729 teachers confirming self-efficacy as a mediator between PD quality and effectiveness.

Reviewer's comment: Henson and Ross references – there are much more up to date studies you could cite here to support these – Ross is 30 years old now.

Thank you for pointing out the need to update the Henson (2001) and Ross (1994) references with more contemporary evidence. We hw your data reflects the population.

Thank you for your request for more information about the Israeli education system. We have addressed this by adding a comprehensive section titled The Israeli Educational Context.

we have addressed this by incorporating recent systematic reviews and meta-analyses that provide stronger, current support for these claims.

Reviewer's comment: I would like to read about the education system in Israel relevant to this study, teacher PD for example, and the demographics of the teaching population, as I am not clear about how this research may be relevant to other contexts, or ho

Reviewer's comment: Measures section – 80 out of 82 participants were female – is this reflective of the teaching population in Israel?

Reviewer's comment: Teachers were from Grade 2 and 3 – what does this mean – age of children, sense of the literacy curriculum for these grades?

Reviewer's comment: In your diagrams and charts, sometimes the time goes left to right, sometimes right to left – could these be standardised please?

Thank you for your attention to the consistency of our visual presentations. We have addressed this by standardizing all diagrams and charts to follow a consistent left-to-right temporal progression.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

Comments and Suggestions for Authors

While the authors state that they conducted an a priori power analysis using parameters appropriate for their main statistical procedures (paired-sample t-tests and 2×2 ANOVA with repeated measures, with effect size f = 0.25, α = .05, and power = .95), the figures reported are misleading. In reality, power analyses for the detection of interaction effects in a 2×2 repeated measures ANOVA with such parameters require a considerably larger sample, typically in the range of 120–210 participants, depending on the assumed correlation between repeated measures. The cited threshold of 54 participants is only adequate for the simplest paired-samples t-test or main effects in a between-subjects ANOVA, not for the more demanding interaction effects that are central to their design. With a sample size of 82, the study is underpowered for the intended analyses, especially at the high level of statistical power claimed. Therefore, the claim that the study is “adequately powered” is not substantiated, which undermines the reliability of the findings.

The revised manuscript continues to focus almost exclusively on reliability coefficients (e.g., Cronbach’s alpha) when presenting evidence for the measurement tools, while offering only superficial references to construct validity (e.g., referencing prior work or the process of translation and expert review). There is still a lack of substantive validity evidence, such as the presentation of results from factor analyses in the current sample, demonstration of measurement invariance, or criterion-related validity checks. Given the centrality of these instruments to the research, this omission is critical and compromises the study’s scientific rigor.

Although the authors now acknowledge the lack of normality in most of their key variables, they justify the use of parametric methods by citing literature that these are robust in “large” samples. However, the current sample size (N = 82) cannot reasonably be characterized as large enough to override substantial departures from normality, especially in the context of non-random sampling and skewed group sizes. This decision further calls into question the appropriateness and interpretability of the reported results.

Despite explicit recommendations, the authors did not integrate relevant and current international literature in their theoretical framework and discussion. For instance, I specifically suggested the integration of “Drama-based methodologies and teachers’ self-efficacy in reading instruction” (https://doi.org/10.1080/03323315.2025.2479438), which experimentally examines the impact of experiential and drama-based approaches on teacher self-efficacy in reading instruction. The study offers comparative evidence on alternative methodologies and would have provided valuable context and triangulation for the present research. This literature was not incorporated, nor was the argument strengthened in the discussion, as recommended.

Tables 2 and 3 were required to be revised to fully comply with APA guidelines, particularly in terms of clarity, structure, and presentation of variables and statistical outcomes. Despite the authors’ claim that this has been addressed, Table 2 (and, to a lesser extent, Table 3) remains unchanged and does not meet APA standards. This undermines the manuscript’s presentation and makes it difficult for readers to interpret the results.

Given these persistent and substantial shortcomings (particularly the inadequate power analysis, insufficient validity evidence for research instruments, questionable analytic choices, and failure to address the most essential feedback) I must recommend rejection of the manuscript in its current form. The scientific rigor, clarity of reporting, and responsiveness to review expectations are not at a level that would support publication in an international, peer-reviewed journal.

Should the authors wish to pursue publication in the future, I strongly advise that they address the aforementioned concerns in a fundamentally revised and methodologically strengthened manuscript.

Author Response

Dear Editor,

We are pleased to resubmit our revised manuscript titled "Transforming Teacher Knowledge to Practice: Exploring the Impact of a Professional Development Model on Teachers' Literacy Instruction and Self-Efficacy".

We appreciate the constructive feedback provided by the reviewer, which has strengthened our manuscript. The review process has enabled us to enhance both the methodological rigor and the clarity of our findings, particularly regarding the dose-response relationship between professional development implementation and teacher self-efficacy.

Please find below our detailed point-by-point responses to each reviewer's comments, along with explanations of the corresponding changes made to the manuscript (Main changes are marked in green).

1. While the authors state that they conducted an a priori power analysis using parameters appropriate for their main statistical procedures (paired-sample t-tests and 2×2 ANOVA with repeated measures, with effect size f = 0.25, α = .05, and power = .95), the figures reported are misleading. In reality, power analyses for the detection of interaction effects in a 2×2 repeated measures ANOVA with such parameters require a considerably larger sample, typically in the range of 120–210 participants, depending on the assumed correlation between repeated measures. The cited threshold of 54 participants is only adequate for the simplest paired-samples t-test or main effects in a between-subjects ANOVA, not for the more demanding interaction effects that are central to their design. With a sample size of 82, the study is underpowered for the intended analyses, especially at the high level of statistical power claimed. Therefore, the claim that the study is “adequately powered” is not substantiated, which undermines the reliability of the findings.

Our response: We sincerely thank the reviewer for the crucial methodological critique regarding power analysis for interaction effects in repeated measures ANOVA. The reviewer is absolutely correct that our sample size of N=82 is inadequate for reliably detecting interaction effects with 95% power. Rather than making questionable statistical assumptions to justify our original approach, we have comprehensively revised our analytical strategy for Research Question 3 and updated the manuscript accordingly. We now employ an independent samples t-test comparing self-efficacy change scores between implementation groups. This approach tests the identical research question (dose-response relationship between implementation frequency and self-efficacy gains) while providing adequate statistical power with our sample size and yielding stronger, more interpretable results. Key Methodological Advantages: (1) Change scores (post-PD minus pre-PD self-efficacy) directly quantify improvement magnitude for each teacher; (2) Implementation groups showed no significant baseline differences (t(80) = 1.10, p = 0.275), validating group comparisons; (3) The analysis yields a large, robust effect (t(80) = 2.95, p = 0.004, Cohen's d = 0.80); (4) Results reveal a clear threshold effect—teachers implementing 5+ lessons showed substantial gains (M = 0.50) while those implementing fewer lessons showed minimal improvement (M = 0.04). All changes are highlighted in green throughout the revised manuscript, including: (1) Methods: Revised statistical analysis and power calculations for independent t-test approach; (2) Results: Complete revision of Research Question 3 with new change score analysis; (3) Table 4 & Figure 5: Updated to present change scores and threshold effect; (4) Discussion: Refocused on 5-lesson threshold implications; (5) Abstract & Practical Implications: Updated to reflect new findings.

2. The revised manuscript continues to focus almost exclusively on reliability coefficients (e.g., Cronbach’s alpha) when presenting evidence for the measurement tools, while offering only superficial references to construct validity (e.g., referencing prior work or the process of translation and expert review). There is still a lack of substantive validity evidence, such as the presentation of results from factor analyses in the current sample, demonstration of measurement invariance, or criterion-related validity checks. Given the centrality of these instruments to the research, this omission is critical and compromises the study’s scientific rigor.

Our response: We thank the reviewer for this important comment and have addressed it by incorporating evidence of construct validity through factor analyses (EFA) conducted on the current sample for all major self-report measures included in the study. Specifically: For the Teacher Self-Efficacy in Literacy Instruction (TSELI) scale (translated and adapted by Lipka, 2017), a factor analysis (principal component analysis with varimax rotation) was performed. The results supported a two-factor structure, consistent with the original theoretical model, reflecting teachers’ self-efficacy in oral reading and writing instruction. This analysis is now described in the Measures section and key results have been added to establish construct validity in the Hebrew adaptation. Similarly, factor analyses were performed on the teachers' knowledge questionnaires (before and after the professional development program scores). All analyses supported a single-factor structure, indicating that the items coherently measured a unified construct of literacy-related pedagogical knowledge. Summaries of these results have also been integrated into the measures section. These additions supplement the previously reported reliability estimates. We hope this addresses the concern regarding the depth of validity evidence. As this is a field-based study conducted with the entire teacher population in the district, measurement invariance testing across subgroups was not applicable. However, future studies with larger and more diverse samples may expand on this work with confirmatory factor analysis and invariance testing.

3. Although the authors now acknowledge the lack of normality in most of their key variables, they justify the use of parametric methods by citing literature that these are robust in “large” samples. However, the current sample size (N = 82) cannot reasonably be characterized as large enough to override substantial departures from normality, especially in the context of non-random sampling and skewed group sizes. This decision further calls into question the appropriateness and interpretability of the reported results.

Our response: Thank you for this important observation. We agree that normality violations should be addressed with caution, particularly when using parametric analyses. In our revised manuscript, we have acknowledged the deviations from normality found in several key variables, and discussed this as a methodological limitation in the Limitations section. While our sample size (N = 82) may not be considered large in absolute terms, it represents the entire population of teachers participating in the PD program in this educational district. Therefore, this was not a sample drawn through random selection, but a complete population within the research context, and could not have been expanded. We have added a clarification in the Participants section noting that the sample was not randomly selected, as all eligible teachers in the district were included in the study. Furthermore, the study was conducted in a naturalistic field setting, where conditions do not always allow for perfect randomization or ideal sample distributions. Given these constraints, we followed prior empirical and methodological literature suggesting that parametric tests, including t-tests and repeated measures ANOVA, are robust to moderate violations of normality when sample size exceeds 30 participants (e.g., Blanca et al., 2017; Schmider et al., 2010). Nonetheless, we acknowledge the limitations this imposes and encourage further studies to replicate the findings using larger and more diverse samples. Nonetheless, in response to the reviewer’s concerns, we have added a limitation paragraph acknowledging the deviations from normality, skewness and kurtosis values, and their potential impact on generalizability. This is in line with our commitment to transparency and scientific rigor.

4. Despite explicit recommendations, the authors did not integrate relevant and current international literature in their theoretical framework and discussion. For instance, I specifically suggested the integration of “Drama-based methodologies and teachers’ self-efficacy in reading instruction” (https://doi.org/10.1080/03323315.2025.2479438), which experimentally examines the impact of experiential and drama-based approaches on teacher self-efficacy in reading instruction. The study offers comparative evidence on alternative methodologies and would have provided valuable context and triangulation for the present research. This literature was not incorporated, nor was the argument strengthened in the discussion, as recommended.

Our response: We appreciate the reviewer's specific recommendation to integrate Mastrothanasis and Kladaki's (2025) experimental study on drama-based methodologies and teacher self-efficacy. Following this guidance, we have incorporated their research throughout our manuscript in four key locations: (1) the literature review, where we integrated their controlled study of 204 Greek teachers showing large effect sizes (d = 1.21-1.15) for experiential approaches; (2) our discussion of self-efficacy findings, where their results provide convergent international evidence for implementation-focused professional development effectiveness; (3) our analysis of implementation effects, where their 8-week sustained implementation findings cross-validate our 5-lesson threshold discovery; and (4) our practical implications, where we cite their cross-cultural evidence that hands-on PD principles transcend specific educational contexts. This integration strengthens our theoretical framework by demonstrating that large self-efficacy improvements occur consistently across different PD modalities (literacy-specific vs. drama-based) and international contexts (Israeli vs. Greek), supporting our conclusion that experiential, implementation-focused approaches are universally critical for meaningful teacher confidence development. All additions are highlighted in green throughout the revised manuscript.

5. Tables 2 and 3 were required to be revised to fully comply with APA guidelines, particularly in terms of clarity, structure, and presentation of variables and statistical outcomes. Despite the authors’ claim that this has been addressed, Table 2 (and, to a lesser extent, Table 3) remains unchanged and does not meet APA standards. This undermines the manuscript’s presentation and makes it difficult for readers to interpret the results.

Our response: We acknowledge that Table 2 required comprehensive reformatting to meet APA 7th edition standards. The original table had structural issues with column organization, inconsistent statistical reporting, and unclear variable presentation. We have completely revised the table with proper APA formatting, clear variable labels, standardized statistical reporting, and comprehensive notes explaining the non-significant group differences.

We revised Table 3 with proper APA formatting: correct p-value reporting (< .001), clear variable names, organized structure, and comprehensive notes. We report t-values and significance levels which adequately demonstrate the magnitude of knowledge improvements.

The revisions have resulted in a more robust empirical contribution that advances understanding of effective professional development design in literacy education. Our findings provide practical guidance for educators and policymakers by identifying specific implementation thresholds necessary for professional development success.

We believe these revisions have addressed all reviewer concerns while maintaining the study's core contribution to the professional development literature. We look forward to your consideration of our revised manuscript.

Sincerely,

Authors

Author Response File: Author Response.pdf

Article Menu

Transforming Teacher Knowledge to Practice: Exploring the Impact of a Professional Development Model on Teachers’ Literacy Instruction and Self-Efficacy

Further Information

Guidelines

MDPI Initiatives

Follow MDPI