Commentary
Peer-Review Record

Ten Points for High-Quality Statistical Reporting and Data Presentation

Appl. Sci. 2020, 10(11), 3885; https://doi.org/10.3390/app10113885
by Pentti Nieminen
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 5 May 2020 / Revised: 26 May 2020 / Accepted: 29 May 2020 / Published: 3 June 2020
(This article belongs to the Special Issue Medical Informatics and Data Analysis)

Round 1

Reviewer 1 Report

Review of paper “Ten points for high-quality statistical reporting and data presentation”.

 

This paper proposes a checklist-type instrument to evaluate the quality of statistical reporting in medical journals, drawing on published guidelines for reporting and data presentation. It addresses an important issue that exists not only in medical research but also in many other domains that use statistical data for research.

 

My general comments are:

  1. What is the reason for statistical reporting? Knowing it well could help identify the key points. One argument I have is that one could provide a detailed description of the data acquisition, processing steps, and analysis, adequate ancillary information, and beautiful figures and tables, and yet still cherry-pick the data to make the graphs prettier and match the conclusion.
  2. Although those ten points were verified against 160 papers, I am still not convinced that this is a procedure-oriented guideline effective enough for the reviewing process. Most of the checkpoints sound like common sense to a data scientist. We would need more domain-specific and less subjective criteria within the peer review process.

 

The proposed checklist reads more like a data quality checklist to apply before submitting a paper. It is not solid enough to serve as general guidance for editors and reviewers, who should focus more on the scientific value of the paper based on good-quality data information and representation. If those checkpoints are not met, the paper should not even be submitted, or it will be rejected by reviewers/editors in the first place. Therefore, I do not think the scope and quality of the paper are a good fit for MDPI.

Author Response

Response to Reviewer #1

I thank the reviewer for the feedback on my manuscript “Ten points for high-quality statistical reporting and data presentation”.

 

This paper proposes a checklist-type instrument to evaluate the quality of statistical reporting in medical journals, drawing on published guidelines for reporting and data presentation. It addresses an important issue that exists not only in medical research but also in many other domains that use statistical data for research.

I agree with the reviewer that these issues are not only pertinent to medical articles.

 

What is the reason for statistical reporting? Knowing it well could help identify the key points. One argument I have is that one could provide a detailed description of the data acquisition, processing steps, and analysis, adequate ancillary information, and beautiful figures and tables, and yet still cherry-pick the data to make the graphs prettier and match the conclusion.

Unfortunately, too many medical articles do not provide a sufficiently clear, accurate, or complete account of what was done and what was found. The purpose of my study was to help editors, reviewers and authors to improve the quality of reporting of research. I have added the following text to the Discussion section to clarify this:

“To clinicians and scientists, the literature is an important means of acquiring new information to guide health care research and clinical decision making. Poor reporting is unethical and may have serious consequences for clinical practice, future research, policy making, patient care and ultimately for patients [10]. Securing the quality of publications is an important activity of journals in their editorial policy. Journals need to be more proactive in providing information about the quality of what they publish [10,46]. Journals should consider strategies and actions to verify that authors assume full responsibility for the statistical reporting quality of their manuscripts. Use of the short checklist proposed in this paper together with reporting guidelines would be an important step towards quality-secured research.”

 

Although those ten points were verified against 160 papers, I am still not convinced that this is a procedure-oriented guideline effective enough for the reviewing process. Most of the checkpoints sound like common sense to a data scientist. We would need more domain-specific and less subjective criteria within the peer review process.

I agree that to some researchers these items sound like common sense. However, in my experience, this is not obvious to all biomedical or health science researchers. Consulting biostatisticians face this problem frequently. I think we still need initial statistical reviews.

 

The proposed checklist reads more like a data quality checklist to apply before submitting a paper. It is not solid enough to serve as general guidance for editors and reviewers, who should focus more on the scientific value of the paper based on good-quality data information and representation. If those checkpoints are not met, the paper should not even be submitted, or it will be rejected by reviewers/editors in the first place. Therefore, I do not think the scope and quality of the paper are a good fit for MDPI.

Several findings have demonstrated that a noteworthy percentage of articles, even those published in high-prestige journals, have flaws in their statistical reporting and data presentation. Thus, a simplified checklist could be handy for editors and reviewers to spot issues that might indicate more serious problems in the reporting of scientific articles. I have already noted in the Discussion section that “If the reviewer cannot find the basic information and description related to the data analysis, the reviewer does not need to read the whole article. After checking tables and figures and reading through the statistical analysis subsection in the methods section, the reviewer can reject the manuscript on good grounds.”

 

Reviewer 2 Report

I would like to thank the author for a great initiative and a well-conducted study with a well-defined, thought-through design and a valuable end result. Ways to improve the review process are clearly needed, as is also apparent from a previous manuscript of my own that is cited in this manuscript. The checklist is certainly good enough to add valuable help to the review process already now. I do hope, though, that this could kick-start a project to agree on a list of items, as well as criteria for considering them fulfilled, among a larger group of scientists, preferably with the author in charge, so that it becomes accepted by the scientific community in the same way as guidelines such as PRISMA, STROBE and RECORD.

I very much liked item 8. It is remarkable how poor authors are at giving references for their methods. This needs to be improved. I liked the other items as well, but I mention this one specifically because I have rarely seen it highlighted in the literature, while the others have commonly been brought up (not that this has helped enough to raise the quality, sadly).

Even though I am very positive about the manuscript, I have some things I would like to bring up. Most of them are only thoughts that could potentially improve the manuscript, which the author himself can decide whether to address.

Revision required:
1) As you defined 20 draft questions for the protocol, you should clarify which ones these are, as that certainly adds value if someone wants to repeat your task of developing a protocol or suggest refinements to it. That you started with 20 draft questions is a strength of the study.

2) A minor thing is that I think "fulfill the underlying assumptions", as written for item 7, is too strong a wording. It can never be proven that assumptions are fulfilled; rather, they are satisfactorily fulfilled. This small point might be important to clarify if the checklist is to be usable at a satisfactory level. It is a somewhat subjective judgement, but that is unavoidable, as it is for most of the items in the checklist.

3) 40 articles were chosen from each journal. Please be specific about the selection criteria. Was it random? Was it the first 40 spotted? Or how did you choose them?

4) There is no table listing the results in relation to the total score of 0–10. I would at least expect this to be done according to the five categories suggested in Section 2.2. The manuscript is mainly about testing how well the newly suggested instrument works. Still, I think the actual research that has been conducted deserves to be better presented and reasoned around. There is actually a high rate of manuscripts not fulfilling the criteria, and it would be very interesting and valuable for the field to know the proportion of manuscripts that did not even reach an acceptable level, and to highlight this finding. There are already sufficiently many articles showing the poor quality of reporting, but this information nevertheless deserves to be repeated even if it is not the main message of the submitted manuscript. So I would expect this to be handled in the Discussion section. Considering that it is not the main purpose of the manuscript, it might not be warranted to present these results in the conclusion, but that is up to the author to think further about.

Suggestions that the author can decide whether to revise based on:
1) It would improve the reasoning around the items if references were used to back up why they are important. One of the things I am skeptical about is the requirement that the test be specified in the table. I have always considered this to be something the reader should check in the statistical analysis part if they are further interested, unless the choice of method is in its uniqueness fundamental to the results (which I do not consider the chi-square test to be), so it could be excluded at times. It would be good to support this and similar demands with discussions by other researchers, if such exist. The importance of specifying the confidence interval (mentioned for item 4) is, for instance, something that was brought up by Gardner and Altman (Gardner MJ, Altman DG. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J. 1986;292(6522):746–50.) and can be cited to support this demand. It is interesting that this was brought up as early as 1986, which further supports the need for a checklist such as the one suggested in the manuscript.

2) It would be valuable if the author could add reasoning around the roles of both the authors and the journals in securing the quality of publications. I especially miss a discussion of the role that journals have. Discussing this would certainly give an even better argument as to why a checklist is so extremely important in ensuring that research is correctly done and quality-secured (otherwise wrong policy recommendations can be made and influence health, well-being, and even increase mortality!).

3) Item 7, for me, is rather something that belongs to the results, as it is only after you have done your analyses that you can verify whether the assumptions hold. It could therefore be worth considering restructuring Table 1.

4) In my own assessments of reporting quality, the thing that surprised me most was the lack of reasoning around the limitations of the statistical methods used. You can rarely find a method that fits your data perfectly, but you should be able to reason about how well your method worked. It might be worth considering an item in the checklist to see whether authors can show that they have a fair understanding of how well their methods worked, and thereby how reliable their results are.

5) At the end of the first page of the discussion, it might be valuable to add the reference by Pouwels, who looked more closely into the STROBE guideline and how well journals using it fulfilled the STROBE criteria.

Author Response

Response to Reviewer #2

I thank the reviewer for the constructive feedback on my manuscript “Ten points for high-quality statistical reporting and data presentation”. I have resolved the issues mentioned by the reviewers.

 

I would like to thank the author for a great initiative and a well-conducted study with a well-defined, thought-through design and a valuable end result. Ways to improve the review process are clearly needed, as is also apparent from a previous manuscript of my own that is cited in this manuscript. The checklist is certainly good enough to add valuable help to the review process already now. I do hope, though, that this could kick-start a project to agree on a list of items, as well as criteria for considering them fulfilled, among a larger group of scientists, preferably with the author in charge, so that it becomes accepted by the scientific community in the same way as guidelines such as PRISMA, STROBE and RECORD.

Thank you. The project proposed by the reviewer is interesting. I agree that there is a need for this project.

 

I very much liked item 8. It is remarkable how poor authors are at giving references for their methods. This needs to be improved. I liked the other items as well, but I mention this one specifically because I have rarely seen it highlighted in the literature, while the others have commonly been brought up (not that this has helped enough to raise the quality, sadly).

I am very pleased that the reviewer advocated bringing this item up in the tool.

 

Revision required:
1) As you defined 20 draft questions for the protocol, you should clarify which ones these are, as that certainly adds value if someone wants to repeat your task of developing a protocol or suggest refinements to it. That you started with 20 draft questions is a strength of the study.

I have now added these draft questions.

 

2) A minor thing is that I think "fulfill the underlying assumptions", as written for item 7, is too strong a wording. It can never be proven that assumptions are fulfilled; rather, they are satisfactorily fulfilled. This small point might be important to clarify if the checklist is to be usable at a satisfactory level. It is a somewhat subjective judgement, but that is unavoidable, as it is for most of the items in the checklist.

I appreciate this comment. I have now used “satisfactorily fulfill” instead of “fulfill”.

 

3) 40 articles were chosen from each journal. Please be specific about the selection criteria. Was it random? Was it the first 40 spotted? Or how did you choose them?

I have now included the following text to clarify the selection of evaluated articles:

“The starting article for each journal was chosen randomly from the journal’s chronological list of articles, with the only criterion being that at least 39 eligible subsequent articles were published that year in the journal in question. The following 39 consecutive articles were also included in the review.”

 

4) There is no table listing the results in relation to the total score of 0–10. I would at least expect this to be done according to the five categories suggested in Section 2.2. The manuscript is mainly about testing how well the newly suggested instrument works. Still, I think the actual research that has been conducted deserves to be better presented and reasoned around. There is actually a high rate of manuscripts not fulfilling the criteria, and it would be very interesting and valuable for the field to know the proportion of manuscripts that did not even reach an acceptable level, and to highlight this finding. There are already sufficiently many articles showing the poor quality of reporting, but this information nevertheless deserves to be repeated even if it is not the main message of the submitted manuscript. So I would expect this to be handled in the Discussion section. Considering that it is not the main purpose of the manuscript, it might not be warranted to present these results in the conclusion, but that is up to the author to think further about.

I agree and am grateful to the reviewer for the helpful suggestion. I have now included a figure (Figure 1) showing the distribution of the quality score.

 

Suggestions that the author can decide whether to revise based on:
1) It would improve the reasoning around the items if references were used to back up why they are important. One of the things I am skeptical about is the requirement that the test be specified in the table. I have always considered this to be something the reader should check in the statistical analysis part if they are further interested, unless the choice of method is in its uniqueness fundamental to the results (which I do not consider the chi-square test to be), so it could be excluded at times. It would be good to support this and similar demands with discussions by other researchers, if such exist. The importance of specifying the confidence interval (mentioned for item 4) is, for instance, something that was brought up by Gardner and Altman (Gardner MJ, Altman DG. Confidence intervals rather than P values: estimation rather than hypothesis testing. Br Med J. 1986;292(6522):746–50.) and can be cited to support this demand. It is interesting that this was brought up as early as 1986, which further supports the need for a checklist such as the one suggested in the manuscript.

I appreciate this comment. I have now included more text and references to support reasoning behind the selected items. These also include Gardner and Altman (1986).

 

2) It would be valuable if the author could add reasoning around the roles of both the authors and the journals in securing the quality of publications. I especially miss a discussion of the role that journals have. Discussing this would certainly give an even better argument as to why a checklist is so extremely important in ensuring that research is correctly done and quality-secured (otherwise wrong policy recommendations can be made and influence health, well-being, and even increase mortality!).

I have now included the following comment in the discussion:

“To clinicians and scientists, the literature is an important means of acquiring new information to guide health care research and clinical decision making. Poor reporting is unethical and may have serious consequences for clinical practice, future research, policy making, patient care and ultimately for patients [10]. Securing the quality of publications is an important activity of journals in their editorial policy. Journals need to be more proactive in providing information about the quality of what they publish [10,46]. Journals should consider strategies and actions to verify that authors assume full responsibility for the statistical reporting quality of their manuscripts. Use of the short checklist proposed in this paper together with reporting guidelines would be an important step towards quality-secured research.”

 

3) Item 7, for me, is rather something that belongs to the results, as it is only after you have done your analyses that you can verify whether the assumptions hold. It could therefore be worth considering restructuring Table 1.

I agree that Item 7 is related to the results. One of my key principles was to evaluate only the tables and figures in the Results section, and assumptions of statistical tests are not verified in tables and figures; guidelines recommend reporting their validity in the Statistical analysis subsection. I thus prefer to keep Item 7 in the Materials and methods section.

 

4) In my own assessments of reporting quality, the thing that surprised me most was the lack of reasoning around the limitations of the statistical methods used. You can rarely find a method that fits your data perfectly, but you should be able to reason about how well your method worked. It might be worth considering an item in the checklist to see whether authors can show that they have a fair understanding of how well their methods worked, and thereby how reliable their results are.

Thank you for drawing my attention to the robustness of results. It is important that authors comment on the statistical limitations of their study. Usually, the limitations of the statistical methods and the robustness of the results are discussed alongside other limitations in the Discussion section, not in the Statistical methods subsection or in the Results section. This is an item that could be added to an extended version of my checklist.

 

5) At the end of the first page of the discussion, it might be valuable to add the reference by Pouwels, who looked more closely into the STROBE guideline and how well journals using it fulfilled the STROBE criteria.

I have now added a reference to the paper by Pouwels et al.

Reviewer 3 Report

I find this manuscript interesting and valid. However, its position in the literature is not sufficiently developed.

The author proposes a checklist-type instrument by selecting and refining items from previous reports about the quality of statistical reporting in medical journals and from published guidelines for reporting and data presentation. A total of 160 original medical research articles that were published in 4 journals were evaluated to test the instrument. A high score suggested that an article had a good presentation of findings in tables and figures and that the description of analysis methods was helpful to readers.

Interrater and intrarater agreements were examined by comparing quality scores assigned to 40 articles published in a psychiatric journal.  

I am really interested in this kind of instrument, which can be used as an initial indicator of research quality. Using it, we might easily check whether, based on its numerical data analyses, a published research article is readable, understandable, and accessible to healthcare professionals.

In fact, in health sciences and medical research, data analysis methods have become an essential part of empirical research. This work proposes an applicable checklist for quickly testing the statistical reporting quality of manuscripts. This instrument aims to improve the quality of empirical research in scientific fields where statistical methods play an important role.

In my opinion, the manuscript's proposal is relevant to increasing the quality of peer review processes in academic communication. It provides an operational procedure for evaluating submissions in terms of their numerical data analysis. I myself have used the proposed instrument in evaluating this submission. The total score was close to 10. Therefore, it could be accepted following the author's proposal.

The weakness of the manuscript is its lack of ties to the literature (1) on models of peer review and (2) on reviewer strategies in journal peer review. The existing literature has several examples of research on peer review whose models have adopted some of the assumptions or modeling choices also used in this paper. I think more work is needed to acknowledge this literature and identify what is novel in this work.

Author Response

Response to Reviewer #3

I thank the reviewer for the constructive feedback on my manuscript “Ten points for high-quality statistical reporting and data presentation”. I have resolved the issues mentioned by the reviewers.

 

I find this manuscript interesting and valid. However, its position in the literature is not sufficiently developed.

The author proposes a checklist-type instrument by selecting and refining items from previous reports about the quality of statistical reporting in medical journals and from published guidelines for reporting and data presentation. A total of 160 original medical research articles that were published in 4 journals were evaluated to test the instrument. A high score suggested that an article had a good presentation of findings in tables and figures and that the description of analysis methods was helpful to readers.

Interrater and intrarater agreements were examined by comparing quality scores assigned to 40 articles published in a psychiatric journal. 

I am really interested in this kind of instrument, which can be used as an initial indicator of research quality. Using it, we might easily check whether, based on its numerical data analyses, a published research article is readable, understandable, and accessible to healthcare professionals.

In fact, in health sciences and medical research, data analysis methods have become an essential part of empirical research. This work proposes an applicable checklist for quickly testing the statistical reporting quality of manuscripts. This instrument aims to improve the quality of empirical research in scientific fields where statistical methods play an important role.

I am grateful to the reviewer for the positive view of the proposed instrument.

In my opinion, the manuscript's proposal is relevant to increasing the quality of peer review processes in academic communication. It provides an operational procedure for evaluating submissions in terms of their numerical data analysis. I myself have used the proposed instrument in evaluating this submission. The total score was close to 10. Therefore, it could be accepted following the author's proposal.

Thank you.

The weakness of the manuscript is its lack of ties to the literature (1) on models of peer review and (2) on reviewer strategies in journal peer review. The existing literature has several examples of research on peer review whose models have adopted some of the assumptions or modeling choices also used in this paper. I think more work is needed to acknowledge this literature and identify what is novel in this work.

Thank you for drawing my attention to the models of peer review and reviewer strategies. I have now cited several recently published articles which are relevant to my study. I have also added the following text to the Discussion section to clarify the role of statistical review in peer review models:

“Leading medical journals, such as the Lancet, BMJ, Annals of Medicine and JAMA, have adopted statistical review. Despite the demonstration of widespread statistical and data presentation errors in medical articles, growth in the use of statistical reviewers has been slow [50]. A recent survey found that only 23% of the top biomedical journals reported that they routinely employed statistical review for all original research articles [51]. The introduction of specialist statisticians to the peer review process has made peer review more specialized. In addition, statistical reviewing is time intensive and limited by both reviewer supply and expense.

In biomedical journals, there is no single model for statistical review in peer review strategies [52–54]. Some journals recruit statistical methodologists to the editorial board; some draw their statistical reviewers from an external pool. If not all papers can be statistically reviewed, editors have to select which manuscripts should undergo statistical scrutiny. There are also models in which subject reviewers are assisted in commenting on the statistical aspects of a manuscript [48,49,55]. However, these checklists cover all aspects of data analysis extensively, and it is not straightforward for non-statistical reviewers to get an overall impression of the statistical quality from them.”

“In recent years, several journals have tried to improve their peer review processes [54]. Their efforts have focused on introducing openness and transparency into the models of peer review. New strategies in peer review might help to address persistent statistical reporting and data presentation issues in the medical literature [54]. Software algorithms and scanners have been developed to assess the internal consistency and validity of statistical tests in academic writing [55]. However, their use is still rare and limited to flagging specific potential errors. Open peer review, where all peer reviews are made openly available, brings into use new models in which the quality of a paper may be assessed by the whole scientific community. The pre- and post-publication peer review models include commenting systems for the readership. Readers could use tools, such as the one proposed in this paper, to give feedback to authors. Subsequently, the authors prepare a second version of the manuscript reflecting the comments and suggestions of the scientific community.”

 

Round 2

Reviewer 1 Report

The paper recommends a series of procedures for quality control of statistical reporting. It is a good practice paper in its own domain. However, the paper did not analyze in detail the causes of poor statistical reporting and how to tackle them. These are the fundamental questions behind why and how to do this work. I understand that the reasons vary and could span a wide spectrum of practical issues that cannot all be solved, but the paper should provide a comprehensive analysis of them. Otherwise, it is like addressing the consequence based on the consequence, when it should be the other way around.
