Text Correction
There was an error in the original publication [1]. We would like to adjust the numerical values in Section 3.2. Specifically, we identified a rounding error in the following sentence, which could lead to a misunderstanding regarding our selection of optimal hyperparameters. This rounding issue might create the impression that we determined the best hyperparameters based on test set performance, which is not the case. Instead, we selected the run with the highest validation performance as the optimal configuration.
A correction has been made to Section 3.2:
The optimal parameter combination is selected using the averaged performance from the 5-fold CV; the metric considered is the averaged macro F1-score achieved on the validation dataset. The best performance was achieved by a network with the following hyperparameters: batch size 32, three CNN blocks with the “increasing filter, fixed kernel size” scheme, dropout with a rate of 0.2 as a regularization technique, and two LSTM layers. This combination results in a macro F1-score of 0.956 on the training set, 0.955 on the validation set, and 0.906 on the test set. Across all tested configurations, the macro F1-scores fall in the ranges of 0.932–0.969 (training set), 0.939–0.955 (validation set), and 0.867–0.906 (test set).
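To make the selection criterion explicit, here is a minimal Python sketch of the logic described above: average the per-fold validation macro F1-scores of each hyperparameter configuration and pick the configuration with the highest mean. The configuration labels and score values are hypothetical placeholders, not results from the study, and the sketch does not reproduce the authors’ actual training pipeline.

```python
import numpy as np

# Hypothetical per-fold validation macro F1-scores for each hyperparameter
# configuration (placeholder values, not the scores reported in the paper).
cv_results = {
    "bs32_3cnn_incr-filter_drop0.2_2lstm": [0.951, 0.958, 0.953, 0.956, 0.957],
    "bs64_2cnn_incr-filter_drop0.3_1lstm": [0.941, 0.945, 0.939, 0.944, 0.942],
    "bs32_1cnn_fixed-filter_drop0.2_2lstm": [0.940, 0.938, 0.943, 0.939, 0.941],
}

# Average the validation macro F1-score over the five folds per configuration.
mean_val_f1 = {name: float(np.mean(scores)) for name, scores in cv_results.items()}

# The optimal configuration is the one with the highest averaged validation score;
# the test set is not consulted at this stage.
best_config = max(mean_val_f1, key=mean_val_f1.get)
print(best_config, round(mean_val_f1[best_config], 3))
```

Because only validation scores enter the comparison, the test set plays no role in choosing the configuration, which is exactly the point the corrected sentence is meant to clarify.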
The Academic Editor has also instructed us to round all other entries of the same metric in the text to three decimal places so that the reporting is consistent throughout the publication. Therefore, additional corrections are required in other sections, which we list below.
A correction has been made to Section 3.3:
Regardless of the number of CNN blocks and the CNN structure used, the macro F1-scores on the training (0.935–0.973), validation (0.948–0.960), and test (0.877–0.901) sets are found to be close to the optimum.
A correction has been made to Section 4.5:
For example, the HS variants achieved a macro F1-score of 0.582–0.729 on the training dataset but a weighted F1-score of 0.821–0.856. The rating distribution of these variants shows that rating “1” is underrepresented; accordingly, a misclassified example of rating “1” influences the macro F1-score significantly more than misclassified examples of the other classes. In contrast, the influence of a misclassified example on the weighted F1-score is independent of its rating. The macro F1-score of the HS left dataset is a good example: the exercise scores 0.582 on the training dataset, while there are hardly any examples of rating “1”, so each of these few examples has a huge impact on the score.
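To illustrate why the two averages diverge, the following sketch (with made-up labels, not data from the study) uses scikit-learn’s f1_score to compare macro and weighted averaging on a label distribution in which rating “1” is strongly underrepresented.

```python
from sklearn.metrics import f1_score

# Hypothetical ratings: rating "1" is strongly underrepresented (2 of 32 samples).
y_true = ["1"] * 2 + ["2"] * 15 + ["3"] * 15

# Predictions: one of the two rating-"1" samples is misclassified as "2",
# and one sample of each majority class is also misclassified.
y_pred = ["2", "1"] + ["2"] * 14 + ["3"] + ["3"] * 14 + ["2"]

macro = f1_score(y_true, y_pred, average="macro")        # every class counts equally
weighted = f1_score(y_true, y_pred, average="weighted")  # classes weighted by support

print(f"macro F1:    {macro:.3f}")    # noticeably lowered by the rare class
print(f"weighted F1: {weighted:.3f}") # barely affected by the rare class
```

Because macro averaging gives the two rating-“1” examples the same weight as the fifteen examples of each majority class, a single misclassified “1” pulls the macro score well below the weighted score.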
A correction has been made to Section 4.7:
The macro F1-score per exercise is improved by 0.040 (DS), 0.245 (IL), 0.205 (HS), and 0.054 (TSP).
As the majority of instances of this metric appear in tables, the third decimal place also has to be added to three tables:
A correction has been made to Table 2:
| CNN-Blocks | IMU-Specific (Train/Validation/Test) | Channel-Specific (Train/Validation/Test) | Baseline (Train/Validation/Test) |
|---|---|---|---|
| 1 | 0.945/0.951/0.891 | 0.960/0.960/0.900 | 0.936/0.956/0.894 |
| 2 | 0.960/0.958/0.900 | 0.959/0.948/0.896 | 0.973/0.959/0.881 |
| 3 | 0.954/0.956/0.896 | 0.935/0.949/0.877 | 0.952/0.953/0.901 |
A correction has been made to Table 3:
| Dataset | Training Set | Validation Set | Test Set |
|---|---|---|---|
| Hurdle Step | 0.686 ± 0.045 | 0.679 ± 0.049 | 0.645 ± 0.049 |
| Hurdle Step right | 0.729 ± 0.037 | 0.755 ± 0.062 | 0.687 ± 0.041 |
| Hurdle Step left | 0.582 ± 0.041 | 0.566 ± 0.019 | 0.546 ± 0.018 |
| Inline Lunge | 0.877 ± 0.037 | 0.862 ± 0.050 | 0.825 ± 0.023 |
| Inline Lunge right | 0.863 ± 0.044 | 0.815 ± 0.062 | 0.840 ± 0.037 |
| Inline Lunge left | 0.868 ± 0.012 | 0.846 ± 0.050 | 0.849 ± 0.046 |
| Trunk Stability Pushup | 0.953 ± 0.027 | 0.897 ± 0.043 | 0.914 ± 0.037 |
| Deep Squat | 0.941 ± 0.029 | 0.948 ± 0.014 | 0.900 ± 0.021 |
A correction has been made to Table 4:
| Dataset | Training Set | Validation Set | Test Set |
|---|---|---|---|
| Hurdle Step | 0.816 ± 0.019 | 0.821 ± 0.015 | 0.301 ± 0.284 |
| Hurdle Step right | 0.854 ± 0.040 | 0.792 ± 0.021 | 0.267 ± 0.258 |
| Hurdle Step left | 0.821 ± 0.019 | 0.868 ± 0.022 | 0.405 ± 0.409 |
| Inline Lunge | 0.912 ± 0.031 | 0.880 ± 0.026 | 0.331 ± 0.177 |
| Inline Lunge right | 0.859 ± 0.030 | 0.806 ± 0.017 | 0.442 ± 0.353 |
| Inline Lunge left | 0.884 ± 0.026 | 0.813 ± 0.036 | 0.498 ± 0.347 |
| Trunk Stability Pushup | 0.953 ± 0.022 | 0.954 ± 0.010 | 0.154 ± 0.318 |
| Deep Squat | 0.978 ± 0.010 | 0.953 ± 0.007 | 0.485 ± 0.427 |
The authors state that the scientific conclusions are unaffected. This correction was approved by the Academic Editor. The original publication has also been updated.
Reference
- Spilz, A.; Munz, M. Automatic Assessment of Functional Movement Screening Exercises with Deep Learning Architectures. Sensors 2023, 23, 5.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).