Development and Explainability of Models for Machine-Learning-Based Reconstruction of Signals in Particle Detectors
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper explores the use of modified autoencoder architectures and convolutional neural networks for determining signal parameters in high-energy physics experiments. It also includes an upsampling version of the model to enhance the precision of the results.
I found this study intriguing for its application of machine learning to solve complex challenges. The methodologies and insights are clearly presented and are quite impactful, making this a great contribution to the journal. I recommend its publication in Particles. I do have one question for the authors: beyond the standard mean square error, have you considered other forms of error metrics, such as absolute percentage errors or mean square errors with regularization? Could these alternatives potentially improve the model's performance?
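For concreteness, the alternatives mentioned here could look like the following minimal NumPy sketch (the function names, the eps guard, and the lam value are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def mse(y_true, y_pred):
    """Standard mean squared error, the metric used in the paper."""
    return np.mean((y_true - y_pred) ** 2)

def mape(y_true, y_pred, eps=1e-8):
    """Mean absolute percentage error; eps guards the division where the
    target is zero, which dominates sparse waveform-style labels."""
    return 100.0 * np.mean(np.abs((y_true - y_pred) / (y_true + eps)))

def mse_l2(y_true, y_pred, weights, lam=1e-4):
    """MSE plus an L2 penalty on the model weights, one common form of
    'mean square error with regularization'."""
    return mse(y_true, y_pred) + lam * np.sum(weights ** 2)
```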
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
Comments and Suggestions for Authors
Dear Authors,
The paper is well written and structured.
However, a more detailed quantitative comparison with other data analysis methods is missing.
You could also explain more clearly the percentage improvements over traditional methodologies.
Comments for author File: Comments.pdf
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
Comments and Suggestions for Authors
— L21: perhaps an appropriate reference would be https://link.springer.com/article/10.1007/s41781-021-00066-y
— L44: why was a convolutional autoencoder chosen here? The reference (and the typical use case) is for images; do the authors have any studies of alternative layers (some type of recurrence, or even just dense layers) that could be described in words here to motivate the choice?
— L47: I think the “modified autoencoder” needs more description here: I don’t understand it even after reading the reference. Are you simply changing the inputs that you are auto-encoding over, e.g., now you reconstruct the array of 0s + amplitude where the peaks are? If so, I wouldn’t really describe this as “modified”, or even say that you’re using labels, but rather that it’s a regular autoencoder with a different input modeling.
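To make the distinction under discussion concrete, the two training setups can be sketched as follows (a toy NumPy example; the length of 1024 comes from the authors' later response, while the peak position, shape, and amplitude are invented for illustration):

```python
import numpy as np

N = 1024                            # fixed waveform length (from the authors' response)
rng = np.random.default_rng(0)
waveform = rng.normal(0.0, 0.1, N)  # toy baseline noise
peak_idx, peak_amp = 350, 5.0       # illustrative peak position and amplitude
waveform[peak_idx:peak_idx + 20] += peak_amp * np.exp(-np.arange(20) / 5.0)

# Classical autoencoder: the training target is the input itself.
target_classical = waveform.copy()

# Setup described in the paper: the encoder still receives the raw
# waveform, but the training target is an array of zeros carrying the
# amplitude only at the peak position.
target_modified = np.zeros(N)
target_modified[peak_idx] = peak_amp
```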
— L72: what is the value of the fixed length that you used?
— I don’t understand the impact of the occlusion sensitivity studies based on what is described here. If you mask away the peaks and the model evaluates over that event, what am I supposed to make of the fact that the loss spikes? Does that mean that window would have been more poorly reconstructed? But if the model is trying to reconstruct a sequence of 0s, maybe that explains the behavior? More explanation is needed around L81-85 and L91-100. I also think some plots showing the mask and what the masked inputs look like would be a helpful addition to Figs. 2 and 3.
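For reference, occlusion sensitivity is typically computed along these lines (a minimal sketch; the Keras-style model.predict call is an assumption about the interface, and the window size of 18 is taken from the Fig. 3 caption discussed in Round 2):

```python
import numpy as np

def occlusion_sensitivity(model, x, window=18, baseline=0.0):
    """Slide a mask of `window` samples across the waveform and record the
    model's loss at each position; loss spikes mark the regions the model
    relies on for its reconstruction."""
    losses = []
    for start in range(len(x) - window + 1):
        x_masked = x.copy()
        x_masked[start:start + window] = baseline         # occlude this window
        y_pred = model.predict(x_masked[None, :])[0]      # Keras-style call (assumed)
        losses.append(float(np.mean((y_pred - x) ** 2)))  # MSE against the unmasked input
    return np.array(losses)
```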
— L103-104: I don’t understand the sentence “higher precision than the rounded values”; I don’t see how the Occlusion Sensitivity studies indicate this. Please spell it out.
— Fig. 4: I don’t understand this at all. So the time labels are quantized to single ns? Why? What does this have to do with Occlusion Sensitivity? Why are the timing differences asymmetric around 0? Does this indicate some systematic over- or underprediction?
The English seems fine, though I find it hard to follow the explanations or extract information from them.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
Dear authors,
Thanks for your work in preparing a new draft with lots of new and useful information! And thanks for the extra text explanations (and for your patient explanations in the comments), I think they help the clarity of the manuscript quite a bit. I provide a few new comments below.
“— L47: I think the ‘modified autoencoder’ needs more description here: I don’t understand it even after reading the reference. Are you simply changing the inputs that you are auto-encoding over, e.g., now you reconstruct the array of 0s + amplitude where the peaks are? If so, I wouldn’t really describe this as ‘modified’, or even say that you’re using labels, but rather that it’s a regular autoencoder with a different input modeling.”

“The array of 0s + amplitude where peaks are is set as a ‘label’ in the sense that the desired output of the network is not the same array as the input, like in the classical autoencoder case, but this ‘modified’ version of the input. The input layer which is fed to the model to encode over is still the raw waveform of 1024 non-zero values, as would happen when applying the model to real data coming out of a digitizer. This was clarified in the text, as well as in the description of the left panel of Figure 1, where an example event is shown.”
—> So this is an autoencoder that takes samples in and generates a list of times out? How do you train the autoencoder then; what is the loss, since reconstruction error doesn’t apply? How can the model trained with the loss in Eq. 2 output the timestamps of peaks in the waveform? Later, at L215, I think you mean that you do some trainings with the waveform being both input and reconstructed output, and a separate model with the arrival times being both input and reconstructed output; did I understand correctly that you never reconstruct something different from the input? Maybe this part should be moved earlier?
I still think the term “label” is very confusing and should be changed, as it is well known in the machine learning literature to refer to the ground truth of an input that is given to the model during training for backpropagation.
— L89: “hey” is missing a “T” at the beginning (should read “They”).
— Fig. 3: these examples of the masked data are helpful, thanks! But in the caption you say the mask is of size 18, while in the plot the “masked data” take up the whole waveform, with just a small sliver around 350 “unmasked”. So these labels are still confusing. Wouldn’t the “masked data” be the data that is within the mask?
— Fig. 4: maybe it would be appropriate to overlay the corresponding waveform on this plot, so that we can correlate the behavior of the loss with the rise and fall of the waveform?
— Fig. 8: I would add to the caption a reminder that the positive mean of the MAC comes from errors related to the rounding; otherwise it stands out as a mean offset with no clear explanation.
Author Response
Please see the attachment.
Author Response File: Author Response.pdf