Article
Peer-Review Record

Probability Density Estimation through Nonparametric Adaptive Partitioning and Stitching

Algorithms 2023, 16(7), 310; https://doi.org/10.3390/a16070310
by Zach D. Merino 1,2,†,‡, Jenny Farmer 3,‡ and Donald J. Jacobs 2,*,§
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 8 May 2023 / Revised: 11 June 2023 / Accepted: 16 June 2023 / Published: 21 June 2023
(This article belongs to the Collection Parallel and Distributed Computing: Algorithms and Applications)

Round 1

Reviewer 1 Report

The manuscript presents a new nonparametric adaptive splitting and stitching algorithm for estimating the probability density function of one variable. The topic is relevant in the field. It is shown by the authors of the manuscript that the NPS (NET) method is the most effective method in general, while NAPS (KDE) improves estimates near the boundary compared to standard KDE. Previously published KDE methods suffer from two inherent disadvantages, which in practice have been eliminated by the authors of this article. The method proposed by the authors does not require prior knowledge of the characteristics of the input data, it is free from bandwidth selection, and the user does not need to set any parameters. Conclusions are consistent with the evidence and arguments and they address the main question posed. The references are appropriate. The tables and figures are made in high quality and appropriate to the topic.

Author Response

We thank the reviewer for such a positive review. Our revisions to the manuscript are in blue text for your convenience. We added more than 5 new pages, which include two new figures and a table. Note that our full summary of changes to the manuscript is attached in a reply to Reviewer 4.

Reviewer 2 Report

The problem of PDF estimation has a long history; however, it remains relevant today. In my opinion, the algorithm proposed by the Authors is well organized and effective. The paper, in general, is well written and mathematically sound. I have no essential critical remarks on the manuscript. The bibliography is prepared carefully. From the practical point of view of potentially using the algorithm, it would be interesting to know whether it is sufficiently precise in the case of extreme value probability distributions, such as the Gumbel or Frechet PDFs. Some words about this problem could be introduced into the final version of the paper. Moreover, I recommend a detailed proofreading of the paper to avoid potential linguistic errors and typos.


Author Response

We express our gratitude to the reviewer for their positive feedback. We appreciate the suggestion to include the Gumbel and Frechet distributions, known for their extreme values and heavy-tailed characteristics. In response to this suggestion, we have incorporated a total of six new distributions, including these two. To aid in the review process, all our revisions have been highlighted in blue text for easy identification. These revisions have resulted in an addition of more than five pages, encompassing two new figures and a table. Note that our full summary of changes to the manuscript is attached in a reply to Reviewer 4.
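
For readers who want to try an estimator on the two distributions named above, the following is a minimal sketch of drawing Gumbel and Frechet samples by inverse-transform sampling. It is not taken from the manuscript: the function names, the default parameters (`mu`, `beta`, `alpha`, `scale`), and the sample sizes are illustrative assumptions, and the parameter values used in the paper are not stated in this record.

```python
import math
import random
import statistics


def _open_unit(rng):
    """Return a uniform draw in the open interval (0, 1)."""
    u = rng.random()  # random() returns values in [0, 1)
    while u == 0.0:   # reject 0 so that log(u) is always defined
        u = rng.random()
    return u


def sample_gumbel(n, mu=0.0, beta=1.0, rng=random):
    """Draw n Gumbel(mu, beta) samples (hypothetical helper).

    CDF: F(x) = exp(-exp(-(x - mu) / beta)),
    so the inverse is x = mu - beta * ln(-ln(u)).
    """
    return [mu - beta * math.log(-math.log(_open_unit(rng))) for _ in range(n)]


def sample_frechet(n, alpha=2.0, scale=1.0, rng=random):
    """Draw n Frechet(alpha, scale) samples (hypothetical helper).

    CDF: F(x) = exp(-(x / scale) ** (-alpha)) for x > 0,
    so the inverse is x = scale * (-ln(u)) ** (-1 / alpha).
    """
    return [scale * (-math.log(_open_unit(rng))) ** (-1.0 / alpha) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed, chosen only for repeatability
    gumbel = sample_gumbel(10_000, rng=rng)
    frechet = sample_frechet(10_000, rng=rng)
    # Compare sample medians with the closed-form medians as a sanity check.
    print("Gumbel median:", statistics.median(gumbel),
          "expected:", -math.log(math.log(2)))
    print("Frechet median:", statistics.median(frechet),
          "expected:", math.log(2) ** -0.5)
```

Either sample list can then be passed to whichever density estimator is under test. Comparing the sample median with the closed-form median is only a quick check that the sampler behaves as intended, not a measure of estimator accuracy in the tails.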

Reviewer 3 Report

REVIEW

Title of the paper: Probability Density Estimation through Nonparametric Adaptive Partitioning and Stitching

Manuscript Number: algorithms-2413237

General conclusion: Major Revision.

Comments

Many thanks for a very relevant paper. In this paper, a novel nonparametric adaptive partitioning and stitching algorithm to estimate a probability density function of a single variable is presented. This paper contains an interesting proposal; my overall impression is that the manuscript presents some results that could be useful in practice. I recommend its publication, but first the work needs a full revision in several areas:

 

1.    In this paper, there are 14 figures. Figures 1, 3, 5, 8, 9, 12, 13 and 14 are very good, but the other figures are very poor. All figures should be in eps format.

2.    In figure 2, the authors chose some distributions such as normal, trimodal normal, generalized Pareto, etc. Why did the authors choose these distributions? In addition, some distributions such as the slash distribution and double slash distribution should be added, because these distributions have heavier tails than the normal distribution.

3.    The caption of figure 3 is very long. This caption should be shorter, and some description must be added to the paragraph after this figure.

4.    On page 7, line 276, the words "Section 3.5" should be replaced by "Subsection 3.5".

5.    On page 8, the collection of estimates for each of the blocks in both layers 1 and 2 for a tri-modal distribution is introduced. Which of them is better?

6.  On page 9, figure 5a, the Y axis represents the  the red curve gives   please clarify whether or not this is true?

7.    On page 10, some description of Algorithms 1 and 2 should be added.

8.    In figure 6(a, b), the PDFs decrease and then increase; please clarify whether or not this is true.

9.    The end of equation 7 should be "." as equation 6.

10.  Some numerical tables should be added, figures cannot be fully reliable without numerical calculations.

11. Check for typographical errors.

12.  The Conclusion section must be modified so that some results from the paper are added. Please justify the conclusion section.

Comments for author File: Comments.pdf

Minor editing of English language required

Author Response

Thank you for your positive review and excellent suggestions. Our revisions to the manuscript are in blue text for your convenience. We added more than 5 new pages, which include two new figures and a table. Note that our full summary of changes to the manuscript is attached in a reply to Reviewer 4.

 

There was a suggestion that we expand the introduction to discuss previous work more broadly. The changes should address the bandwidth issue and the resolution issue, which will always be present for any numerical estimator.

We have taken the reviewer's suggestions into careful consideration and made several improvements to the manuscript. Specifically, we have significantly expanded the introduction section to provide more in-depth discussions on the bandwidth issue in KDE and the specific challenges associated with maximum entropy methods. We have included relevant references to support these discussions. Additionally, we have provided a detailed explanation of how our method differs from the standard MaxEnt methods. Furthermore, we have expanded on the prior work related to the parallelization of density estimation algorithms. These enhancements have strengthened the manuscript by providing a more comprehensive and informative introduction.

 

There was a suggestion to clarify whether R is a general statistic or something specific to the uniform distribution.

Thank you for bringing this issue to our attention. Upon revisiting the section in question, we acknowledge that the original presentation may have inadvertently led to confusion regarding the nature of R. While our methodology remains unchanged, we recognize the need for further clarification to alleviate any potential ambiguity. Consequently, we have expanded the section to provide a more thorough explanation and address this concern. We sincerely appreciate your valuable suggestion, which has helped us improve the clarity of our manuscript.

 

There was a suggestion to remove our key reference in the abstract.

We did this.

Reviewer 4 Report

Why is there a reference in the Abstract? That needs to be removed.

Lines 31-40: you discuss the user issue of selecting a bandwidth, but is that not a prevalent issue for all data analytical methods? Even methods that require no user input induce prior assumptions through implicit parameters. Please discuss this a bit more in the general scheme, as this is ubiquitous.

Line 197: R is declared as a very general term for a statistical measure, and then specifically defined in Eq. 4. Which one is it? What does R represent? We see it later as a ratio related to the difference set.

What are other recent approaches to this problem that can be compared against? The literature review citations cover the challenge but not what else has been done on this question.

 

good

Author Response

Thank you for your suggestions and your careful reading of the manuscript, which helped us make specific changes. Our revisions to the manuscript are in blue text for your convenience. We added more than 5 new pages, which include two new figures and a table. We attach a PDF that summarizes all our changes.

Author Response File: Author Response.pdf

Round 2

Reviewer 3 Report

Thanks for a good response to my comments.

 

Minor editing of English language required
