Meta-Analysis for Math Teachers’ Professional Development and Students’ Achievement
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
Review of the paper Meta-Analysis for Math Teachers’ Professional Development and Students’ Achievement
The paper presents a literature review of studies on how mathematics teachers’ participation in professional development (PD) affects students’ achievement. Given the need to improve students’ mathematical competence and the efforts to do so through various professional development projects, the study raises an important question. The reviewed papers are quantitative studies of PD that include outcomes of students’ achievement, and the meta-analysis in the paper is quantitative. I must admit that I am not very familiar with quantitative methods (I realized too late, after agreeing to review it, that the paper concerns them), and I hope that other reviewers can contribute more in that area. My comments relate to other aspects.
The paper is well-written and concerns an important question, but there are things that can be improved, as I see it.
The introduction needs to be better structured.
- There are some repetitions (e.g., most of lines 67-75?) and the flow can be improved (e.g., in lines 59-60, the study’s aim is stated, but how is it related to lines 61-66?)
- In 1.2, lines 85-92, seven moderators are introduced. It is not clear where they come from, what they are about (e.g., the professional development category of the program), or why exactly these were chosen. This is explained much later in the paper, but the reader needs to know more already at this point.
- I struggle with the subtitles in the introduction and the text belonging to each. For example, why does the text 118-131 belong to title 1.3.1 and not 1.3.2? It is about topics in PD. What is actually the difference between the titles 1.3.3 and 1.3.4? The authors need to either restructure or help the readers to understand.
- I feel that there is much repetition in the introduction, and, as a reader, I still feel unsure at the end about what is known from before (particularly, what kind of literature reviews have been conducted before and what they have revealed), what this study contributes, and how it is similar to or different from previous literature reviews. I think this is really important to improve in the paper.
- Clear research questions need to be stated in the introduction to make the paper more readable and coherent.
Method:
- Again, the seven components of PD are presented (lines 215-222) without elaborating on what they are, where they come from, and why.
- The literature review includes only the USA and Canada context. Why? Could some words be added about how it can be relevant for others? (Minor: “non-USA studies” in line 256 – are studies from Canada to be included?)
- Table 1 is difficult to read with all the abbreviations; maybe some can be avoided? Also, it is only here that it becomes apparent that, for example, “PD Teaching Approach” is about content only, pedagogy only, and content and pedagogy. But isn’t this more about the content (what is worked on) in the PD, rather than teaching approaches? Also, “PD category” is not clear to me; what is it about?
Results
- Subsections are structured according to the research questions, but the research questions are clearly stated only later, in the discussion. They need to be introduced earlier, as I have mentioned before.
- In 3.3, the different codes of the seven moderators are finally introduced. It needs to come earlier, and in more detail.
- One of the moderators is “study design”. While the other components/moderators are characteristics of the PD program, this one is about the methodology of the study examining it. To me, it seems strange to consider study design as a PD component (in RQ2) on the same level as the other characteristics of a given PD program. I suggest taking it out and instead discussing it as a methodological finding.
Discussion
- As in the introduction, also here it would be good if the authors could be clearer about what the contribution of the study is.
- In 4.2, a mismatch between this study and previous studies is pointed out. What can be the reasons for the mismatch? What is different in this study compared to the others? It would be interesting if the authors could say something about that.
- The section seems to be organized in accordance with the RQs, but then there are subtitles 4.8 and 4.9 which are actually codes of “PD category” (4.7)? Also, I have to admit that I still do not know what the codes in “PD Category” stand for. Is that some categorization of PD which is known to people in the USA/Canada? I think there is a need to say more about the codes (and it should be done earlier, in the Methods).
Author Response
Please see the attachment
Author Response File:
Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
This paper is a meta-analysis of the effects of various characteristics of Professional Development (PD) on students’ achievement in mathematics. That is, the study collects results from previous studies where students of teachers who completed professional development in mathematics were tested before and after their teachers’ PD completion, to determine whether the students’ achievement increased. The meta-study includes an overall effect of PD in mathematics, as well as effects of several characteristics, some of which were found to have significant impact on students’ achievement. As the authors argue, such meta-studies are rare in mathematics education research, and usually, studies on PD focus on teachers’ development or qualitative aspects of improvement. Thus, the study is novel and most welcome, despite the fact that the data pool is limited.
This is a systematic meta-analysis using quantitative methods, with which I am not familiar (never having conducted quantitative studies myself). As a consequence, I am not able to assess the quality of the statistical tools and calculations that were utilized in the study. Further, some of my questions and comments might be caused by my unfamiliarity with statistical methods, for which I apologize. However, upon reading, I felt quite convinced that the authors know what they are doing. The references to the PRISMA protocols (details included as supplementary materials), the careful elaboration of methods, the tests to ensure no bias (3.2), and the way the results were written with both numbers/statistics and explanations/interpretations convinced me that this is scientifically sound (although I would be happy to hear what a more competent reviewer would say). The manuscript was, in general, easy to read.
I have one overall comment. Throughout the manuscript, I was confused by the inclusion of “study design” as a moderator in the analysis. It turns out that studies conducted under a randomized study design scored higher on students’ achievement. This is an interesting observation, but this moderator is not a feature of the PD program. Or is there something I don’t get here? I was expecting the results of this meta-study to be useful in the design of new PD models and programs, but how the students are tested does not seem very relevant to this end. However, in the study, this variable is treated as a feature of the PD. If there is something I don’t get here, please help me understand.
Moreover, I have some comments regarding each part of the paper:
Abstract: A bit too long (should be 200 words or less), and the headings were distracting (the guidelines say abstracts should be without headings).
Introduction: p. 2, line 63: The authors claim that the study will “evaluate teachers’ content knowledge, pedagogical content knowledge (…)”, which I cannot see that it does.
The first paragraph of 1.3.1 is repetition and not necessary.
Method: p. 5, line 223: I got the impression that you aimed to collect 30 studies (why?); later I realized this is just the number you were left with after an extensive process (which made sense). Maybe reformulate here? Otherwise, I found the data collection to be very well described. I did, however, wonder about some of the categories you use, which you do not account for so well. In particular, how did you create the various “PD Category” codes? This categorization affects your results, so it would be useful to read more about it (it is not entirely self-explanatory). The same goes for math type (of course, you could choose to divide all maths teaching into geometry/algebra/other, but you could also make different categorizations, in particular in the lower grades).
Results: p. 25-26, lines 430-434: I think you should save discussing the causes of the results for the Discussion.
Discussion: Generally, I think the authors are good at discussing the possible causes and consequences of the results presented in the previous section. However, at some points, I wonder about the correlation/causality question, and consequently about the implications of the results. I had, initially, no problem accepting that some types of PD were “better” than others, e.g., that PD including both content and pedagogy is significantly more effective than pedagogy-only PD. This made sense to me. But the idea that PD is more effective in the middle grades, and more effective for geometry, sounds a bit strange to my ears. Surely we want to build good PD programs for all grades and mathematical topics? I suspect that the results reflect some other good qualities of the PD programs with geometry and middle grades, and that the results did not come from geometry itself (the data material is also very small, often with fewer than five studies with each characteristic). The authors write about the learning opportunities inherent in geometry, but I suppose you could say similar things about algebra. In the end, I am left to wonder: what exactly are you testing here, and what does it mean?
A related question, which is clearly outside my limited statistical knowledge, but I’ll raise it anyway, is how easy it is to separate the various factors (moderators) in one study. The authors carefully explain how one study (paper) often reports on several “interventions”, leading to the measurement of more than one “effect”. How is this done? How do you know that the success of a PD program is due to its treatment of geometry, when it could just as well be the duration or the content focus (content/pedagogy) that matters? Including more variables and drawing conclusions that go against my common sense reduces, in my opinion, the trustworthiness of the study. (But again, I am no expert on the methodology here, so I might be wrong about this.)
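For what it is worth, the separation the reviewer asks about is usually handled by entering the moderators jointly as predictors in a meta-regression, so that each coefficient is adjusted for the others. A minimal sketch, with invented effect sizes and two hypothetical dummy moderators (topic and focus); none of these numbers come from the manuscript:

```python
import numpy as np

# Hypothetical Hedges' g effect sizes, their sampling variances, and two
# dummy moderators: topic (1 = geometry) and focus (1 = content + pedagogy).
g     = np.array([0.40, 0.55, 0.10, 0.35, 0.60, 0.20, 0.45, 0.15])
v     = np.array([0.02, 0.03, 0.02, 0.04, 0.03, 0.02, 0.05, 0.03])
topic = np.array([1, 1, 0, 0, 1, 0, 1, 0])
focus = np.array([1, 1, 1, 0, 0, 0, 1, 0])

# Inverse-variance weights. (A full mixed-effects analysis would add an
# estimated between-study variance tau^2 to v, e.g. via REML.)
w = 1.0 / v

# Design matrix with an intercept and BOTH moderators: fitting them
# jointly is what disentangles topic from focus.
X = np.column_stack([np.ones_like(g), topic, focus])

# Weighted least squares: beta = (X'WX)^{-1} X'Wg
XtW  = X.T * w
beta = np.linalg.solve(XtW @ X, XtW @ g)
se   = np.sqrt(np.diag(np.linalg.inv(XtW @ X)))

for name, b, s in zip(["intercept", "topic (geometry)", "focus (content+ped)"],
                      beta, se):
    print(f"{name:>20s}: b = {b:+.3f}  (SE = {s:.3f})")
```

The sketch only shows the mechanics; whether the manuscript tested moderators jointly or one at a time is exactly the question the reviewer raises.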
The two headings 4.8 and 4.9 can be removed; the text really belongs to 4.7 about PD category. (When reading 4.7, I was again uncertain about the categories: what are they, and how did they emerge?)
I hope the authors found my comments useful, and I wish you the best of luck in further preparation of the manuscript.
Author Response
Please see the attachment
Author Response File:
Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
This manuscript addresses an important and long-standing issue in the world of mathematics professional development. While the author(s) make a reasonable argument for examining their topic, the backing for their argument is based on publications that are overwhelmingly one to three decades old. It seems that there are more current published works that might be included as part of the justification, while continuing to include the seminal works already provided. The literature review feels as if it was written in 2015. A clear example of this is the paragraph that starts on line 193, where a discussion of “…the past 13 years…” is supported with citations, the most recent of which is from 2011.
On line 209, it is unclear what is meant by, “Our manuscript was drafted using….” The checklist is appropriate, but it is unclear what is meant by drafting the manuscript.
It would be so helpful if the information provided in line 225 could be included in the abstract. The connection between 30 studies and 164 effect sizes was very confusing at first.
The results seem skewed toward specific study qualities (like short duration), potentially as a result of including duplicate study contexts as different effect sizes. This seems to be a huge issue in an analysis that sets out to compare study qualities, and one that cannot be overcome with this study design.
In the discussion, when the authors refer to “studies”, it is unclear (but probably likely) that they are referring to the same study context and write-up but parsing it out as though it were different studies. This is misleading and, in my opinion, constitutes a fatal flaw in this manuscript.
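The weighting concern behind this criticism can be sketched with a toy calculation. All numbers below are invented, and the aggregation step uses a simplifying assumption that effect sizes from the same study context are highly correlated (so the study mean is hardly more precise than a single estimate); nothing is taken from the manuscript:

```python
import numpy as np

# Hypothetical: study A contributes four dependent effect sizes from the
# same context; studies B and C contribute one each.
g = np.array([0.50, 0.52, 0.48, 0.51, 0.10, 0.15])
v = np.array([0.02, 0.02, 0.02, 0.02, 0.02, 0.02])
study = np.array(["A", "A", "A", "A", "B", "C"])

def pooled(g, v):
    """Inverse-variance weighted mean."""
    w = 1.0 / v
    return (w * g).sum() / w.sum()

# Naive pooling treats the four A estimates as independent studies,
# so context A carries 4/6 of the total weight.
naive = pooled(g, v)

# Aggregating to one estimate per study first. Under the high-correlation
# assumption, the variance of the study mean stays close to v rather
# than shrinking to v/4.
labels = np.unique(study)
g_agg = np.array([g[study == s].mean() for s in labels])
v_agg = np.array([v[study == s].mean() for s in labels])
agg = pooled(g_agg, v_agg)

print(f"naive pooled g       = {naive:.3f}")   # A dominates
print(f"study-level pooled g = {agg:.3f}")     # each context counted once
```

The gap between the two pooled values illustrates how counting one context several times can pull the overall effect toward whatever qualities that context happens to have.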
Author Response
Please see the attachment
Author Response File:
Author Response.pdf
Round 2
Reviewer 1 Report
Comments and Suggestions for Authors
Thank you for the opportunity to review the revised manuscript. It has been clearly improved. Some issues could still be addressed, as I see it:
Descriptions of moderators on page 2:
- I still find it challenging to understand how “content only, pedagogy only, or a combination of content and pedagogy” relates to PD teaching approach (and not to topics taught in the PD as described in 1.3.2). Can it be written more clearly what PD teaching approach is about?
- I still struggle with understanding what PD Category can be. It is “describing the thematic area or framework around which the PD was organized”, and examples of such an area or framework seem to be: “professional learning communities, formative assessment, curriculum, online/video-based, reform-initiated math, PD approach”. I understand the others, I think, but the last one, PD approach, is somehow everything aiming to improve? So it includes all the other examples? Or is that PDs without a clear thematic area/framework? Can this be written more clearly?
- For each moderator, are the codes (used later in the analysis) the ones given in the parentheses? Are they pre-defined (based on what? this needs to be elaborated on) or developed through the analysis? If the latter, maybe they should be omitted from the introduction and instead presented in the analysis section?
There is some mismatch between the notions used for the moderators in section 1 and the notions used in research question 2 on page 5; they should be aligned. (I have to say that I find the notions in the research question more understandable.)
Can it be indicated more explicitly in section 1.3.3 what this study will contribute compared to the earlier studies mentioned?
The manuscript needs to be edited. E.g., the notion “PD category” is used in 1.1 without definition; the abbreviation PD is used in 1.1 but defined in 1.2 (and then again in 1.3). There are still some gaps in the text flow: e.g., the second paragraph (lines 38-41) is not clearly related to the paragraphs above and below.
Author Response
Please see the attachment
Author Response File:
Author Response.pdf
