# Dance Tempo Estimation Using a Single Leg-Attached 3D Accelerometer


## Abstract


## 1. Introduction

## 2. Materials and Methods

#### 2.1. Data Acquisition

#### 2.1.1. Materials

#### 2.1.2. Measurements

#### 2.2. Signal Processing

#### 2.2.1. Signal Pre-Processing

We low-pass filtered the acquired signals with a cut-off frequency f_co = 50 Hz and finally performed downsampling to f_s = 100 Hz, obtaining 3D acceleration at equidistant time samples, T = 1/f_s = 0.01 s.

The acquired acceleration signal comprises three components, a_x, a_y, and a_z. The specific orientation of the axes in a reference coordinate system is irrelevant.
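The pre-processing chain above can be sketched as follows. Only f_co = 50 Hz and f_s = 100 Hz are given in the text; the 200 Hz raw acquisition rate and the 4th-order Butterworth filter are assumptions for illustration.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess(acc_raw, f_raw=200.0, f_co=50.0, f_s=100.0):
    """Low-pass filter 3D acceleration at f_co, then downsample to f_s.

    f_raw (raw sampling rate) and the filter order are illustrative
    assumptions; the text specifies only f_co and f_s.
    """
    # 4th-order Butterworth low-pass, cut-off normalized to Nyquist of the raw rate
    b, a = butter(4, f_co / (f_raw / 2.0), btype="low")
    filtered = filtfilt(b, a, acc_raw, axis=0)   # zero-phase filtering
    step = int(round(f_raw / f_s))               # decimation factor, here 2
    return filtered[::step, :]                   # equidistant samples, T = 1/f_s

acc_raw = np.random.randn(1000, 3)   # 5 s of synthetic 3-axis data at 200 Hz
acc = preprocess(acc_raw)            # 500 samples at 100 Hz
```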

#### 2.2.2. Dance Tempo Estimation

Dance tempo υ determines the time between consecutive beats, T_υ. In the most simplified scenario, for each music beat, one dance step is executed. Measuring T_υ in seconds and υ in beats per minute, we can write

$$f_{\upsilon} = \frac{1}{T_{\upsilon}} = \frac{\upsilon}{60}, \tag{1}$$

where f_υ denotes the fundamental dance frequency, related to step execution and measured in Hz. Depending on step styling, a number of f_υ harmonics are present to a varying extent. In the given context, we expect f_υ to be dominant over its harmonics.

With a leg change occurring on every beat, each leg executes a step every two beats, doubling the step period to T_step = 2T_υ and introducing f_step = f_υ/2 as the fundamental frequency. We can write

$$f_{step} = \frac{f_{\upsilon}}{2} = \frac{\upsilon}{120}, \tag{2}$$

where f_step denotes the single-leg step fundamental frequency, measured in Hz. In addition, for each harmonic component h of the original signal, components hf_υ ± f_step are now present to a varying extent. In such a simplified scenario, the dance tempo can be estimated by detecting steps by means of time-domain feature extraction.
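The tempo-to-frequency relations of Equations (1) and (2) amount to simple conversions; a minimal sketch:

```python
def dance_frequency(v_bpm):
    """Fundamental dance frequency f_v in Hz, Equation (1): one step per beat."""
    return v_bpm / 60.0

def step_frequency(v_bpm):
    """Single-leg step frequency f_step in Hz, Equation (2): one step per two beats."""
    return v_bpm / 120.0
```

For example, a 100 bpm tempo gives f_step ≈ 0.83 Hz and a 220 bpm tempo gives f_step ≈ 1.83 Hz, matching the range quoted below.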

When a leg change occurs every two beats, the spectral content is enriched with components hf_step ± f_step/2, making f_step/2 the fundamental frequency. Likewise, when a leg change occurs every four beats, the spectral content is enriched with components hf_step ± f_step/4, making f_step/4 the fundamental spectral component. More complex rhythmical variations, in which different leg activation patterns mix within one move, are also possible.

Depending on the leg activation pattern, the fundamental frequency of a move is thus f_step, f_step/2, or f_step/4. Therefore, we can conclude that f_step is the lowest common component, regardless of the performed assembly of moves, although its intensity relative to the other components varies. For longer and diverse assemblies of moves, we can expect f_step to be the maximum frequency component. However, this is far from a straightforward conclusion for short dancing excerpts or assemblies with a repeating pattern of a small number of moves. For the considered υ range of 80–220 bpm, f_step lies between 0.67 and 1.83 Hz.

To estimate f_step, we rely on multiple resonators implemented as IIR feedback comb filters. This choice is motivated by the filter's periodic frequency response: a feedback comb filter resonates at f_comb and at all its harmonics. With dance acceleration signals as filter inputs, a comb filter produces the highest output energy when its resonating frequencies match f_step and its harmonics. By analysing and comparing the outputs of multiple filters, each with a different resonating frequency, we can estimate the most likely value of the step frequency and therefore the dance tempo. Such an approach has already been proven to enable estimation of a song's quarter-note tempo [17], which essentially dictates the dancer's dance tempo.

We implemented a bank of comb filters with integer delays ranging from k_min = 50 to k_max = 200 samples. Each k identifies the first peak of the magnitude response of the corresponding filter:

$$f_{comb} = \frac{f_s}{k}. \tag{3}$$

Given f_s = 100 Hz and the set k range, f_comb increases non-uniformly from 0.50 to 2.00 Hz. Considering f_step instead of f_comb in Equation (2), this range translates to an extended dance tempo range of interest of 60–240 bpm, with ever larger increment steps: 60.00, 60.30, 60.61, …, 235.29, 240.00 bpm. Note that the number of filters in the bank can be adjusted with respect to the plausible dance tempo range. Lowering the number of filters by limiting the tempo range can be particularly beneficial when optimizing run-time execution.
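The mapping from filter delays to comb frequencies and tempos follows directly from Equations (2) and (3); a short sketch of the resulting non-uniform tempo grid:

```python
# Each integer delay k in [50, 200] samples resonates first at f_comb = f_s / k
# (Equation (3)); via Equation (2), tempo = 120 * f_comb in bpm.
f_s = 100.0
ks = range(50, 201)
f_comb = [f_s / k for k in ks]            # 2.00 Hz down to 0.50 Hz
tempos = sorted(120.0 * f for f in f_comb)  # 60.00, 60.30, ..., 235.29, 240.00 bpm
```

The grid spacing grows toward higher tempos, which is why the energy vector below is resampled to equidistant 1 bpm values.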

Each acceleration component a_x, a_y, and a_z is processed with every filter of the bank. For each delay k, the respective filter outputs a_xfk, a_yfk, and a_zfk at each time sample 1 ≤ n ≤ N are calculated according to the following implementation equations:

$$a_{xfk}[n] = (1-\alpha)\,a_{x}[n] + \alpha\, a_{xfk}[n-k], \tag{4}$$

$$a_{yfk}[n] = (1-\alpha)\,a_{y}[n] + \alpha\, a_{yfk}[n-k], \tag{5}$$

$$a_{zfk}[n] = (1-\alpha)\,a_{z}[n] + \alpha\, a_{zfk}[n-k], \tag{6}$$

where 0 < α < 1 is the feedback gain determining the selectivity of the resonance.
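A minimal sketch of the comb filter bank, using the standard feedback form y[n] = (1 − α)x[n] + αy[n − k]; the feedback gain α = 0.9 and the synthetic input are illustrative assumptions, not values from the paper.

```python
import numpy as np

def comb_energy(x, k, alpha=0.9):
    """Output energy of one feedback comb filter (delay k samples) on one axis."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        fb = alpha * y[n - k] if n >= k else 0.0
        y[n] = (1.0 - alpha) * x[n] + fb
    return float(np.sum(y ** 2))

def bank_energies(acc, ks, alpha=0.9):
    """Sum comb output energies over the three axes for every delay k."""
    return {k: sum(comb_energy(acc[:, i], k, alpha) for i in range(3)) for k in ks}

# A synthetic 1.0 Hz "step" signal at f_s = 100 Hz should make the
# k = 100 filter (f_comb = 1.0 Hz, i.e., 120 bpm) resonate most strongly.
f_s = 100.0
t = np.arange(0.0, 20.0, 1.0 / f_s)
acc = np.stack([np.sin(2 * np.pi * 1.0 * t)] * 3, axis=1)
e = bank_energies(acc, range(50, 201))
k_best = max(e, key=e.get)
```

The subharmonic delay k = 200 also resonates at 1 Hz, but its slower energy build-up keeps its output below that of k = 100, anticipating the ambiguity discussed below.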

From the filter outputs of all three axes, we form the energy vector e[f_comb], i.e., the energy of the output for equidistant values of f_comb, corresponding to 60, 61, 62, …, 239, and 240 bpm dance tempos.

We identify the frequency f_max of the output energy maximum:

$$f_{max} = \arg\max_{f_{comb}} e[f_{comb}]. \tag{7}$$

In the ideal case, f_step is the dominant frequency component and should match the frequency of the filter with the maximum energy output. By setting

$$\upsilon_{est} = 120 f_{max}, \tag{8}$$

we obtain the dance tempo estimate.

However, for solo jazz dance moves with a leg change occurring every two beats, the comb filters tend to resonate more strongly for f_step/2 than for f_step. If f_step/2 falls into the considered range for which e[f_comb] is calculated and the dance step frequencies are plausible, applying Equation (8) leads to an underestimate of the dance tempo. Precisely, for moves with f_step/2 as the fundamental frequency, performed at υ, and those with f_step/4 as the fundamental frequency, performed at 2υ, the resulting comb filter responses are very much alike, and it is not possible to uniformly estimate f_step for both such cases using Equations (4)–(8) only. Further, for solo jazz dance moves with a leg change occurring every four beats, the comb filters tend to resonate more strongly for f_step/4 than for f_step, leading again to an underestimate of the dance tempo if f_step/4 falls into the considered comb filter frequency range.

To resolve this ambiguity, we compare the candidate fundamental dance frequency f_υ resulting from Equations (2) and (7), i.e., 2f_max, to its harmonics. We expect f_υ to be dominant.

For each of the f_max multiples that fall into the considered frequency range, we check the significance of the associated value of the energy vector, relative to the significance of e[f_max], by considering the difference between its value and the minimum value in its neighbourhood, according to

$$e[mf_{max}] - \min_{\mathcal{N}(mf_{max})} e[f_{comb}] \;\geq\; c \left( e[f_{max}] - \min_{\mathcal{N}(f_{max})} e[f_{comb}] \right), \tag{9}$$

where the neighbourhood N(·) is taken within the considered f_comb range and c is a scaling factor set to 0.25. If Equation (9) holds for any multiple m, we consider the multiple that is closest to the upper limit of f_comb, denoted Mf_max, as the new f_step candidate, leading to a final dance tempo estimate of Mυ_est. We make the final decision between υ_est and Mυ_est by checking the relative dominance of the associated fundamental dance frequency components, i.e., 2f_max and 2Mf_max. We applied two IIR two-pole resonators, one for each of the candidate frequencies 2f_max and 2Mf_max, to each component of the analysed acceleration signals:

$$a_{r}[n] = (1-r)\,a[n] + 2r\cos\!\left(\frac{2\pi f_{r}}{f_{s}}\right) a_{r}[n-1] - r^{2}\,a_{r}[n-2], \tag{10}$$

where f_r ∈ {2f_max, 2Mf_max} is the resonator centre frequency and r the pole radius. We denote the total output energies of the resonators tuned to 2f_max and 2Mf_max with e_r[2f_max] and e_r[2Mf_max], respectively. If

$$e_{r}[2Mf_{max}] > (1 + c_{r})\, e_{r}[2f_{max}], \tag{11}$$

where c_r is a scaling factor, again set to 0.25 and determined empirically as the best fit, we correct the estimate by setting f_step = Mf_max.
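A hedged sketch of the resonator-based disambiguation: a two-pole IIR resonator is centred on each candidate fundamental frequency and the output energies are compared. The pole radius r = 0.99, the gain normalization, and the synthetic signal are assumptions for illustration.

```python
import numpy as np

def resonator_energy(x, f0, f_s=100.0, r=0.99):
    """Total output energy of a two-pole IIR resonator centred at f0 Hz."""
    w0 = 2.0 * np.pi * f0 / f_s
    b0 = 1.0 - r  # rough gain normalization (assumed)
    y = np.zeros(len(x))
    for n in range(len(x)):
        y1 = y[n - 1] if n >= 1 else 0.0
        y2 = y[n - 2] if n >= 2 else 0.0
        y[n] = b0 * x[n] + 2.0 * r * np.cos(w0) * y1 - r * r * y2
    return float(np.sum(y ** 2))

# Synthetic check: when the 1.67 Hz component (200 bpm step frequency)
# dominates a weaker 0.83 Hz component, the 1.67 Hz resonator collects
# more energy, so the higher-tempo candidate is kept.
f_s = 100.0
t = np.arange(0.0, 20.0, 1.0 / f_s)
x = np.sin(2 * np.pi * 1.67 * t) + 0.3 * np.sin(2 * np.pi * 0.83 * t)
e_hi = resonator_energy(x, 1.67)
e_lo = resonator_energy(x, 0.83)
decision = 1.67 if e_hi > e_lo else 0.83
```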

#### 2.3. Validation

Given f_s = 100 Hz, these values translate to 600–2400 samples, i.e., excerpts of 6–24 s.

Given the sequence lengths, we obtained, on average, slightly under 30 × 10³ testing excerpts for each dance tempo and approximately 480 × 10³ excerpts altogether for all 16 sequences.

We assume that the reference dance tempo υ_ref is known in advance. The reference tempo can be given either by the known tempo of the song or by the target dance tempo. Considering that the dance tempo does not change abruptly and significantly, the reference tempo can also be estimated by analysing previous, longer excerpts of assembled moves. If shorter dance excerpts are analysed offline, the already established overall dance tempo can also be used.
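Since validation compares each excerpt's estimate against υ_ref, the scoring can be sketched as follows; the ±1 bpm tolerance mirrors the 1 bpm granularity of the energy vector and is an assumption here, not a figure stated in this paragraph.

```python
def accuracy(estimates, v_ref, tol_bpm=1.0):
    """Fraction of excerpt tempo estimates within tol_bpm of the reference tempo."""
    hits = sum(1 for v in estimates if abs(v - v_ref) <= tol_bpm)
    return hits / len(estimates)

# Hypothetical excerpt estimates for a 120 bpm reference: one half-tempo
# outlier (60 bpm) among otherwise accurate estimates.
score = accuracy([120, 121, 120, 60, 120], 120)
```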

## 3. Results and Discussion

#### 3.1. Overall Dance Tempo Estimation

Figure 3 presents results for the sequences performed at 100 and 200 bpm, corresponding to f_step = 0.83 Hz and 1.67 Hz, respectively. The left column depicts the amplitude frequency spectra of the analysed sequences, along with the tuned comb filter's magnitude response. All plots are normalized to fit into the [0,1] range.

For both sequences, the spectrum has a clearly distinguishable peak at f_step, corresponding to half of the dance tempo. Besides this frequency, various other components are also distinguishable, including f_step/2 and f_step/4. The presented frequency representation reflects the variability of solo jazz moves and leg activation patterns and is aligned with the reasoning and expectations presented in the previous section.

The right column depicts the output energy over the considered f_comb range. For both sequences, the maximum energy output is obtained at f_max = 0.83 Hz. Considering f_max as the step frequency would result in a dance tempo estimate of 100 bpm. We can further observe that for both sequences, the energy output has a significant peak at 2f_max, giving the possibility of a 200 bpm estimate. From the presented signal frequency content for the first analysed sequence, it is visible that the 100 bpm component is dominant over the 200 bpm component. The resonator-based analysis performed according to Equations (9)–(11) confirms this and discards 2f_max as the step fundamental frequency.

For the second sequence, in contrast, the resonator-based analysis selects 2f_max as the step frequency. In both cases, correct estimates are obtained, matching perfectly the tempo dictated by the metronome.

#### 3.2. Dance Tempo Estimation for Short Excerpts

## 4. Conclusions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

- Kyan, M.; Sun, G.; Li, H.; Zhong, L.; Muneesawang, P.; Dong, N.; Elder, B.; Guan, L. An Approach to Ballet Dance Training through MS Kinect and Visualization in a CAVE Virtual Reality Environment. ACM Trans. Intell. Syst. Technol.
**2015**, 6, 1–37. [Google Scholar] [CrossRef] - Aich, A.; Mallick, T.; Bhuyan, H.B.G.S.; Das, P.; Majumdar, A.K. NrityaGuru: A dance tutoring system for bharatanatyam using Kinect. In Computer Vision, Pattern Recognition, Image Processing, and Graphics; Rameshan, R., Arora, C., Dutta Roy, S., Eds.; Springer: Singapore, 2018; pp. 481–493. [Google Scholar] [CrossRef]
- Dos Santos, A.D.P.; Yacef, K.; Martinez-Maldonado, R. Let’s dance: How to build a user model for dance students using wearable technology. In Proceedings of the 25th Conference on User Modeling, Adaptation and Personalization, Bratislava, Slovakia, 9–12 July 2017; pp. 183–191. [Google Scholar] [CrossRef]
- Drobny, D.; Weiss, M.; Borchers, J. Saltate!: A sensor-based system to support dance beginners. In Proceedings of the 27th Annual CHI Conference on Human Factors in Computing Systems, Boston, MA, USA, 4–9 April 2009; pp. 3943–3948. [Google Scholar] [CrossRef]
- Romano, G.; Schneider, J.; Drachsler, H. Dancing Salsa with Machines—Filling the Gap of Dancing Learning Solutions. Sensors
**2019**, 19, 3661. [Google Scholar] [CrossRef] [PubMed] [Green Version] - Ho, C.; Tsai, W.; Lin, K.; Chen, H.H. Extraction and alignment evaluation of motion beats for street dance. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Vancouver, BC, Canada, 26–31 May 2013; pp. 2429–2433. [Google Scholar] [CrossRef]
- Cornacchia, M.; Ozcan, K.; Zheng, Y.; Velipasalar, S. A Survey on Activity Detection and Classification Using Wearable Sensors. IEEE Sens. J.
**2017**, 17, 386–403. [Google Scholar] [CrossRef] - Lara, O.D.; Labrador, M.A. A Survey on Human Activity Recognition using Wearable Sensors. IEEE Commun. Surv. Tutor.
**2013**, 15, 1192–1209. [Google Scholar] [CrossRef] - Sousa Lima, W.; Souto, E.; El-Khatib, K.; Jalali, R.; Gama, J. Human Activity Recognition Using Inertial Sensors in a Smartphone: An Overview. Sensors
**2019**, 19, 3213. [Google Scholar] [CrossRef] [PubMed] [Green Version] - Sprager, S.; Juric, M.B. Inertial Sensor-Based Gait Recognition: A Review. Sensors
**2015**, 15, 22089–22127. [Google Scholar] [CrossRef] [PubMed] - Paradiso, J.A.; Hsiao, K.; Benbasat, A.Y.; Teegarden, Z. Design and implementation of expressive footwear. IBM Syst. J.
**2000**, 39, 511–529. [Google Scholar] [CrossRef] [Green Version] - Aylward, R.; Lovell, S.D.; Paradiso, J.A. A Compact, Wireless, Wearable Sensor Network for Interactive Dance Ensembles. In Proceedings of the International Workshop on Wearable and Implantable Body Sensor Networks, Cambridge, MA, USA, 3–5 April 2006. [Google Scholar] [CrossRef] [Green Version]
- Hasan, M.; Shimamura, T. A Fundamental Frequency Extraction Method Based on Windowless and Normalized Autocorrelation Functions. In Proceedings of the 6th WSEAS International Conference on Computer Engineering and Applications, and Proceedings of the 2012 American Conference on Applied Mathematics.
- Liu, D.J.; Lin, C.T. Fundamental frequency estimation based on the joint time-frequency analysis of harmonic spectral structure. IEEE Trans. Speech Audio Process.
**2001**, 9, 609–621. [Google Scholar] [CrossRef] - Ferreira, J.L.; Wu, Y.; Aarts, R.M. Enhancement of the Comb Filtering Selectivity Using Iterative Moving Average for Periodic Waveform and Harmonic Elimination. J. Healthc. Eng.
**2018**, 7901502. [Google Scholar] [CrossRef] [PubMed] [Green Version] - Braun, S. The synchronous (time domain) average revisited. Mech. Syst. Signal Process.
**2011**, 25, 1087–1102. [Google Scholar] [CrossRef] - Eyben, F.; Schuller, B.; Reiter, S.; Rigoll, G. Wearable assistance for the ballroom-dance hobbyist: Holistic rhythm analysis and dance-style classification. In Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, Beijing, China, 2–5 July 2007; pp. 92–95. [Google Scholar] [CrossRef]
- Mbientlab MMR. Available online: https://mbientlab.com/metamotionr/ (accessed on 21 September 2021).
- Alphabetical Jazz Steps 3. Available online: https://www.youtube.com/watch?v=jAIwJd2tQo0&list=PLpLDojUPSMvcYMA7jEFPidEbSD2-vNz8m (accessed on 5 April 2021).
- Stančin, S.; Tomažič, S. Time- and Computation-Efficient Calibration of MEMS 3D Accelerometers and Gyroscopes. Sensors
**2014**, 14, 14885–14915. [Google Scholar] [CrossRef] [PubMed] [Green Version]

**Figure 1.** Dance motion capture. We use a single wearable device, including a micro-electromechanical system (MEMS) 3D accelerometer sensor, attached just above the dancer’s right ankle. The precise position and orientation of the sensor are arbitrary.

**Figure 2.** Leg activation patterns in solo jazz dancing. The first row illustrates the music beats that the dancer follows and that define the dance tempo. The remaining rows illustrate different examples of leg activation patterns. Periods in which one leg is dominant are colour-separated from periods of the other leg’s domination. Different hatch fills emphasize the difference in the executed motion elements. Examples (**a**–**c**) illustrate patterns of leg change on every beat, every two beats, and every four beats, respectively, while example (**d**) illustrates a more complex leg activation pattern.

**Figure 3.** Overall dance tempo estimation. The first and second rows refer to the recreational dancer’s 100 and 200 bpm testing sequences, respectively. Images (**a**,**c**) depict in colour the 3D acceleration amplitude frequency spectrum. For both examples, the spectrum has a clearly distinguishable peak at the respective step frequency: (**a**) 0.83 Hz for the 100 bpm tempo and (**c**) 1.67 Hz for the 200 bpm tempo. Depicted in black is the magnitude response of the tuned comb filter. Both plots are normalized to fit into the [0,1] range. Images (**b**,**d**) present the output energy, calculated for the entire range of comb filter frequencies, together with the step frequency estimates.

**Figure 4.** Dance tempo estimation results for short dance excerpts. Each vertical, coloured line additionally illustrates the duration of four consecutive dance moves for the respective dance tempo. For (**a**) the professional dancer, the results for excerpts of such duration are at the level of 1 bpm accuracy for all tempos. For (**b**) the recreational dancer, this holds only for the intermediate tempos, i.e., 100, 120, 140, and 160 bpm.

**Table 1.**Overall dance tempo estimation results. For all but one sequence, the estimated tempo matches perfectly the tempo dictated by the metronome. For the remaining sequence, i.e., the professional’s 120 bpm sequence, the absolute difference is 1 bpm, which is in the range of metronome beat onset stability.

| Metronome Tempo (bpm) | Estimated Tempo (bpm), Professional Dancer | Estimated Tempo (bpm), Recreational Dancer |
|---|---|---|
| 80 | 80 | 80 |
| 100 | 100 | 100 |
| 120 | 121 | 120 |
| 140 | 140 | 140 |
| 160 | 160 | 160 |
| 180 | 180 | 180 |
| 200 | 200 | 200 |
| 220 | 220 | 220 |

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Stančin, S.; Tomažič, S.
Dance Tempo Estimation Using a Single Leg-Attached 3D Accelerometer. *Sensors* **2021**, *21*, 8066.
https://doi.org/10.3390/s21238066
