Review Reports - Fall Detection with CNN-Casual LSTM Network

Round 1

Reviewer 1 Report

The study presents a Casual-CNN with LSTM for elderly fall detection using the benchmark SisFall dataset. The proposed method sounds interesting, but the manuscript needs a major rewrite, especially in the results and discussion.

The literature review lacks discussion of existing studies; there is some discussion focused on image processing, but little on wearable sensors.
In data acquisition, it is ambiguous and confusing to state that data are obtained with an MPU6050 when it is later stated that the study actually uses the SisFall dataset. Please clarify.
Please provide citation(s) for "according to statistics, human falling time is generally less than 2s". Also clarify whether falls differ across age categories, e.g., elderly, teenagers, etc.
The casual LSTM is not the authors' original work, so its detailed description can be simplified, highlighting only the aspects that matter for the discussion.
The ablation experiments lack details, especially on the specific implementation and setup and on how the results were obtained.
There is no discussion of the results, and comparisons with existing studies are lacking. Specifically, the comparison should focus not only on the algorithms but also on studies that target the same elderly fall detection task or that used the same dataset in their experiments.
Regarding the dataset, the class imbalance is significant: fall data comprise only approximately 27% of all the data. The study should address this issue.
The conclusions state that the detection methods are first evaluated, but this is difficult to identify in the manuscript. An analysis of the deficiencies of existing fall detection methods is lacking.
The conclusion also needs a major rewrite: it should not duplicate content from the abstract and should summarize the work properly.
I suggest including future work and the limitations of the proposed method.
The paper should also highlight and specify the contributions.
Lastly, the English grammar needs major revision; there is much duplicated content, and the structure needs a major rewrite.

If rejected, I advise submitting to another, more suitable journal.

Author Response

See attachments

Author Response File: Author Response.docx

Reviewer 2 Report

This paper presents a fall detection methodology using a CNN-Casual LSTM network with three-axis acceleration and three-axis rotation angular velocity inputs. The algorithm was tested on the public dataset SisFall and the results were impressive (accuracy of 99.79%).

Here are some specific comments:

The references in the related work section could be improved. For instance, the vision-based references [9-11] are very limited.
Wearable sensors are not always accepted by the elderly, need batteries, and cannot always be worn (e.g. in the bath or shower). The authors should mention these drawbacks to be fair when comparing with vision and ambient sensors.
The 2-second pieces of data are not overlapping. This means that if a fall occurs at the end of one piece and the beginning of the next piece, it could be more difficult to detect (am I right?). So why not use overlapping pieces of data? This should be discussed.
The mathematical notation in the paper is not always appropriate and consistent. For instance, x is used in Equation (1) and X is used in the text, which could be confusing.
meter types => measurement types?
Why are you creating a 20 x 20 structure for 1D data? This creates a 2D structure from a 1D structure which is not natural and could influence the CNN. Moreover, is there any reason for not using 10 x 40 or 40 x 10 or 16 x 25 etc. instead? Please clarify this.
Equations (1)-(4) are poorly described; maybe you could use the vector transpose to better describe the structure? I don't clearly see the difference between Equations (1)-(2) and (3)-(4), since the features are already arranged based on measurement types. In conclusion, Section 3.2 should be improved.
A figure of the whole network should be provided in section 3.3.
Causl => Causal
"the performance of Casual-LSTM has been greatly improved": I think that the word "greatly" is much too strong.
"the training process has large fluctuations, and the experimental results have underreported, which is absolutely not allowed." I am not sure I understand what you mean by "not allowed" and "underreported". You could maybe rewrite this sentence.
You should give more information and try to explain the 3 false positives, i.e., why they were false positives, by looking at the input data.
"We select RNN and LSTM[31] and a convolutional neural network..." => "We selected RNN and LSTM[31] and a convolutional neural network"
Please check the References section carefully. The references are not always well presented, e.g. the characters [C]// appear in some places and some references are incomplete (e.g. [32]).
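
To illustrate the comment above on non-overlapping 2-second pieces, here is a minimal sketch in plain Python. It is not taken from the manuscript. It assumes a window of 400 samples (consistent with the 20 x 20 arrangement questioned above), and the recording length, the fall position and the helper names window_starts and best_coverage are all hypothetical. It compares how much of a fall that straddles a window boundary is visible in the best single window, with and without 50% overlap.

```python
def window_starts(n_samples, window, step):
    """Start indices of every full window of length `window`, advancing by `step`."""
    return list(range(0, n_samples - window + 1, step))


def best_coverage(starts, window, ev_start, ev_end):
    """Largest fraction of the event [ev_start, ev_end) contained in any single window."""
    best = 0.0
    for s in starts:
        # Length of the intersection between the window and the event, never negative.
        overlap = max(0, min(s + window, ev_end) - max(s, ev_start))
        best = max(best, overlap / (ev_end - ev_start))
    return best


WINDOW = 400        # assumed samples per 2-second piece (20 x 20)
N_SAMPLES = 1200    # hypothetical recording length
FALL = (300, 500)   # hypothetical fall straddling the boundary at sample 400

non_overlapping = window_starts(N_SAMPLES, WINDOW, WINDOW)    # step = window
half_overlap = window_starts(N_SAMPLES, WINDOW, WINDOW // 2)  # step = window / 2

print(best_coverage(non_overlapping, WINDOW, *FALL))  # 0.5
print(best_coverage(half_overlap, WINDOW, *FALL))     # 1.0
```

In this constructed case, no non-overlapping piece contains more than half of the fall, whereas one of the half-overlapping pieces contains all of it. Whether this affects detection accuracy on SisFall would have to be checked experimentally by the authors.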

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 3 Report

The paper describes an approach for fall classification.

The paper is well structured. The material is presented in an understandable form with good explanations.

The following suggestions are given:

1) Please check the formatting of the paper according to MDPI requirements.

2) Please add the aim of the paper and the main contribution.

3) The evaluation metrics subsection (page 9) covers well-known material and can be removed.

4) Please improve the quality of Fig. 1.

5) Please compare the accuracy of your approach with that of existing approaches.

6) The conclusion section should be extended. The limitations of the work should be presented.

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 3 Report

The authors took my comments into account. The results section looks good and is well explained. I have no additional suggestions.