Article
Peer-Review Record

Rhythmic Analysis in Animal Communication, Speech, and Music: The Normalized Pairwise Variability Index Is a Summary Statistic of Rhythm Ratios

by Yannick Jadoul 1,*,†, Francesca D’Orazio 1,*,†, Vesta Eleuteri 2, Jelle van der Werff 1, Tommaso Tufarelli 3, Marco Gamba 4, Teresa Raimondi 1,‡ and Andrea Ravignani 1,5,6,‡
Reviewer 1: Valentin Bégel
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 27 November 2024 / Revised: 10 March 2025 / Accepted: 15 March 2025 / Published: 24 March 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

First, I commend the authors for their writing clarity. The article presents a didactic and convincing mathematical demonstration of the relation between two indexes to measure rhythmic complexity in a discrete signal. The mathematical material presented is very accessible, which is particularly useful for non-experts who want to use or simply understand these indexes. This contribution is sufficiently novel and original to be published as a full-fledged article. However, I would recommend that the authors address and clarify the following points to highlight the strengths and weaknesses of their approach.  

A central criticism is that the authors present this work as a contribution to the unification of disciplines that study rhythmic patterns. Although I agree that it is important to refine methods that quantify rhythmic complexity in a signal, I do not think that this informs us about general mechanisms governing rhythmic structures at the physical, biological or cognitive levels, and I would therefore tone down this claim throughout the manuscript. In fact, the results, discussion and conclusion sections barely come back to this goal. How the mathematical demonstration contributes to it is not explicitly discussed.

There is also a confusion between the physical properties of rhythmic patterns, their biological functions, and the higher-level, cognitive mechanisms that extract rhythmic information. For example, in the “1.3. Measuring Rhythm” section, the beat is introduced (“as the duration of the events themselves is irrelevant to the sense of ‘beat’ that the rhythm induces in the listener”). The sense of beat relies on internal, cognitive mechanisms that are not necessarily directly reflected in the stimulus signal.  

I also miss a section explaining that these quantitative measures assume that prior decisions are made on where the rhythmic events are supposed to be. I don’t agree with, or maybe don’t understand, the last paragraph of section “1.2. Describing Rhythm Across Disciplines” in this regard. Events considered as onsets in figure 1 are clearly defined in discrete signals like drums, but it is much less obvious in more continuous behavioral signals (e.g., language, music, birdsongs) and even less in biophysiological signals (e.g., neurophysiological signals), where signal processing is necessary to extract the temporal information (e.g., peaks of amplitude). Quantification of rhythmic complexity can even be performed at finer-grained temporal resolution at the signal processing stage, without prior extraction of events such as peaks (e.g., with Recurrence Quantification Analyses or Phase coherence), but this is not the case here. This is particularly important as not all disciplines cited in the introduction (speech science, music cognition, and animal bioacoustics) study signals the same way, and the methods presented in this paper are relevant for very specific rhythmic patterns only.

l.46 “A broad range of disciplines are interested in rhythmic behaviors, spanning wildly different spatial and temporal scales”. The sentence is awkward, it feels like it’s the range of disciplines that is spanning different spatial and temporal scales. I assume that it’s actually rhythmic behaviors.

I would be a little more cautious in the last paragraph of section titled “1.1. Rhythm in Speech, Music, and Bioacoustics”. I agree that rhythm is multidimensional, but I think a bit more credit should be given to researchers who tried to define and classify rhythmic signals or behaviors, which are in part acknowledged in the next section. Would you call the field of, say, Memory research, a definitional chaos? Memory is multidimensional and can be studied through various approaches, concepts and theories all more or less connected to the general definition of memory. I think rhythm is similar. Besides, I don’t see any specific contribution related to the definition of rhythm in this article.

Equations should be numbered.

It would also help to explain the first two equations or at least explicitly say that they will be further described later in the text, as readers might spend time trying to understand them when presented first in this version.

Figure 2: why not include an example of the sequences of events that , such as the ones presented in figure 1a?

Typo l. 347: missing parenthesis

 

I look forward to reading your thoughts on the points I raised, hoping they make sense, and potentially discuss that further. Congratulations again on a great piece of work!   

 

Author Response

First, I commend the authors for their writing clarity. The article presents a didactic and convincing mathematical demonstration of the relation between two indexes to measure rhythmic complexity in a discrete signal. The mathematical material presented is very accessible, which is particularly useful for non-experts who want to use or simply understand these indexes. This contribution is sufficiently novel and original to be published as a full-fledged article. However, I would recommend that the authors address and clarify the following points to highlight the strengths and weaknesses of their approach.

We thank the reviewer for these kind words and useful suggestions. We are glad to read that the reviewer found the manuscript accessible, and believe our revisions in response to the reviewer’s comments have further improved the manuscript.

 

A central criticism is that the authors present this work as a contribution to the unification of disciplines that study rhythmic patterns. Although I agree that it is important to refine methods that quantify rhythmic complexity in a signal, I do not think that this informs us about general mechanisms governing rhythmic structures at the physical, biological or cognitive levels, and I would therefore tone down this claim throughout the manuscript. In fact, the results, discussion and conclusion sections barely come back to this goal. How the mathematical demonstration contributes to it is not explicitly discussed.

We thank the reviewer for pointing this out. It is indeed important to keep in mind the distinction between the signal and the underlying mechanisms. We did not intend to make strong claims regarding the correspondence between general mechanisms and rhythmic structure; rather, our aim is to succinctly place the use of rhythmic measures in a broader scientific context. After all, quantifying the differences and commonalities in rhythm between different fields is not the ultimate goal, but a way of getting more insight into these underlying mechanisms.

We hope to have put our findings better into context. In subsection 1.1 of the introduction, we now explicitly mention the study of the rhythmic structures in animal vocalizations/speech/music as only “one possible window” to study the possible underlying mechanisms (line 50).

We also added a paragraph to the manuscript’s conclusion to ensure our findings are framed correctly, following the reviewer’s suggestion (lines 414–417): “Our goal in this study was not to infer general mechanisms governing rhythmic structures at the physical, biological or cognitive levels. Rather, we established an explicit mathematical link between two commonly used quantitative tools. This, in turn, should serve toward building a shared quantitative toolkit for rhythm research.”

We have also gone over the whole manuscript and ensured that all mentions of “unifying” refer explicitly to the two measurements (nPVI and rk), and cannot be misunderstood as claims that our current work is unifying different methodologies or providing a unified cross-disciplinary approach to rhythmic analysis.

 

There is also a confusion between the physical properties of rhythmic patterns, their biological functions, and the higher-level, cognitive mechanisms that extract rhythmic information. For example, in the “1.3. Measuring Rhythm” section, the beat is introduced (“as the duration of the events themselves is irrelevant to the sense of ‘beat’ that the rhythm induces in the listener”). The sense of beat relies on internal, cognitive mechanisms that are not necessarily directly reflected in the stimulus signal.

Thank you. After rereading this paragraph, we agree with the reviewer and have revised this sentence. As our manuscript and the presented rhythmic measures are intended to capture the rhythmic patterns physically present in the signal, we have removed all references to ‘beat’, here (lines 113–117) and among the manuscript’s keywords.

 

I also miss a section explaining that these quantitative measures assume that prior decisions are made on where the rhythmic events are supposed to be. I don’t agree with, or maybe don’t understand, the last paragraph of section “1.2. Describing Rhythm Across Disciplines” in this regard. Events considered as onsets in figure 1 are clearly defined in discrete signals like drums, but it is much less obvious in more continuous behavioral signals (e.g., language, music, birdsongs) and even less in biophysiological signals (e.g., neurophysiological signals), where signal processing is necessary to extract the temporal information (e.g., peaks of amplitude). Quantification of rhythmic complexity can even be performed at finer-grained temporal resolution at the signal processing stage, without prior extraction of events such as peaks (e.g., with Recurrence Quantification Analyses or Phase coherence), but this is not the case here. This is particularly important as not all disciplines cited in the introduction (speech science, music cognition, and animal bioacoustics) study signals the same way, and the methods presented in this paper are relevant for very specific rhythmic patterns only.

We agree this is an oversight in the original manuscript, and that it is important to specify more clearly how to go from empirical data to abstract temporal sequences. As this discretization can vary a lot depending on the type of data and research question, it is difficult to be both comprehensive and detailed. However, we have amended the manuscript by adding an extra paragraph outlining the typical process (lines 89–102). Moreover, we have highlighted the reference to the “Robust Rhythm Reporting Will Advance Ecological and Evolutionary Research” article by Hersh, Ravignani, and Burchardt (2023), which sets out in detail a methodological framework for reporting rhythmic analyses and bridges this gap between temporal data and rhythmic analyses.
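As a rough illustration of the discretization step discussed in this exchange, the sketch below turns a continuous amplitude envelope into onset times and then into inter-onset intervals. It is not the procedure described in the manuscript or in Hersh, Ravignani, and Burchardt (2023): the threshold-crossing rule, the function names `onsets_from_envelope` and `inter_onset_intervals`, the toy envelope, and the sample rate are all assumptions made for this example, and real data would usually need smoothing and a data-specific event definition.

```python
def onsets_from_envelope(envelope, sample_rate, threshold):
    """Return the times (in seconds) at which the envelope rises above
    the threshold. This is a deliberately simple, hypothetical onset rule."""
    onsets = []
    above = False
    for index, value in enumerate(envelope):
        if value > threshold and not above:
            onsets.append(index / sample_rate)
        above = value > threshold
    return onsets


def inter_onset_intervals(onsets):
    """Durations between consecutive onsets."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


# Toy envelope sampled at 100 Hz (assumed values, for illustration only).
envelope = [0.0, 0.1, 0.9, 0.8, 0.2, 0.1, 0.0, 0.7, 0.1, 0.8, 0.9, 0.6, 0.2, 0.0]
onsets = onsets_from_envelope(envelope, sample_rate=100, threshold=0.5)
iois = inter_onset_intervals(onsets)
print(onsets)  # three onsets, at samples 2, 7 and 9
print(iois)    # two inter-onset intervals
```

The choice of threshold already embodies a prior decision about what counts as an event, which is the point the reviewer raises: measures such as the nPVI and rhythm ratios only see the resulting interval sequence.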

 

l.46 “A broad range of disciplines are interested in rhythmic behaviors, spanning wildly different spatial and temporal scales”. The sentence is awkward, it feels like it’s the range of disciplines that is spanning different spatial and temporal scales. I assume that it’s actually rhythmic behaviors.

We thank the reviewer for pointing out this grammatical ambiguity and have revised this sentence (lines 46–47).

 

I would be a little more cautious in the last paragraph of section titled “1.1. Rhythm in Speech, Music, and Bioacoustics”. I agree that rhythm is multidimensional, but I think a bit more credit should be given to researchers who tried to define and classify rhythmic signals or behaviors, which are in part acknowledged in the next section. Would you call the field of, say, Memory research, a definitional chaos? Memory is multidimensional and can be studied through various approaches, concepts and theories all more or less connected to the general definition of memory. I think rhythm is similar. Besides, I don’t see any specific contribution related to the definition of rhythm in this article.

Thank you. Rereading our manuscript, we agree that this paragraph comes across as too strong. We have toned down the language, and added several references to publications that have contributed to the various definitions of rhythm (lines 58–66).

 

Equations should be numbered.

Thank you for pointing this out. We agree that numbering all the non-inline equations improves the readability of our manuscript. In line with this suggestion, we have numbered all of the non-inline equations.

 

It would also help to explain the first two equations or at least explicitly say that they will be further described later in the text, as readers might spend time trying to understand them when presented first in this version.

Thank you for this fresh reader’s perspective and suggestions. In the interest of keeping the introduction as accessible as possible, we have not added the explanation in subsections 1.4 and 1.5. Instead, we have added two forward references to section 2.1, as suggested by the reviewer (lines 154 and 193).

 

Figure 2: why not include an example of the sequences of events that , such as the ones presented in figure 1a?

We thank Reviewer 1 very much for this suggestion! We have updated Figure 2 and agree that it now conveys the message much better. The caption of Figure 2 now also includes a discussion of these sequences (lines 280–284), making the concepts presented in the figure less abstract.

 

Typo l. 347: missing parenthesis

Thank you for pointing out this small mistake!

 

I look forward to reading your thoughts on the points I raised, hoping they make sense, and potentially discuss that further. Congratulations again on a great piece of work!

Reviewer 2 Report

Comments and Suggestions for Authors

The article makes a contribution to rhythm analysis by presenting an interesting theoretical link between the nPVI and RK measures. However, some aspects need to be developed to enhance the validity and practical application of the findings.
While the authors claim that "the duration of the events themselves is often irrelevant to the sense of 'beat' that the rhythm induces in the listener", this statement oversimplifies the complexity of rhythm perception. The way in which intervals between events are filled - whether with silence, static sound or evolving sound - profoundly influences rhythmic perception. Studies since the 1950s, including those on musical rhythms and other organised sounds such as bird song and machine noise, have demonstrated the importance of the content of the space between two articulations in the shaping of auditory perception. To strengthen the paper, I recommend that the authors clarify why they excluded inter-articulation content and how this choice fits with their objectives within this article.
But the major limitation of the article is its reliance on theoretical analysis without empirical validation. The specific sounds or signals used in the rhythmic analyses are not described, only general terms such as 'fish vocalizations' or 'singing primates' are used. A detailed description of the signals is essential to validate the findings and to ground the study beyond theoretical abstraction. This omission limits the ability to assess the applicability of the results to real-world scenarios.
The methodological framework should include specific guidelines for the application of the measures to real-world data.
Practical applications and data handling strategies should be more explicitly detailed to improve accessibility and replicability.
In terms of conclusions, the claim that the RK distribution provides a 'more complete view' (line 276) of rhythmic structure compared to nPVI is compelling but not fully substantiated. The inclusion of real-world examples of how rk distributions capture nuances missed by nPVI would significantly strengthen this claim. Furthermore, the practical implications of the study, particularly in noisy or complex datasets, should provide a clearer understanding of its real-world applicability.
Despite these comments, I consider this to be a very interesting article.

Author Response

The article makes a contribution to rhythm analysis by presenting an interesting theoretical link between the nPVI and RK measures. However, some aspects need to be developed to enhance the validity and practical application of the findings.

We thank the reviewer for the detailed comments and suggestions for improvement. We have done our best to resolve all the issues raised, and believe that the manuscript has indeed been improved through this feedback.

 

While the authors claim that "the duration of the events themselves is often irrelevant to the sense of 'beat' that the rhythm induces in the listener", this statement oversimplifies the complexity of rhythm perception. The way in which intervals between events are filled - whether with silence, static sound or evolving sound - profoundly influences rhythmic perception. Studies since the 1950s, including those on musical rhythms and other organised sounds such as bird song and machine noise, have demonstrated the importance of the content of the space between two articulations in the shaping of auditory perception. To strengthen the paper, I recommend that the authors clarify why they excluded inter-articulation content and how this choice fits with their objectives within this article.

After rereading this paragraph, we agree with the reviewer that this sentence is too strong. Rather than presenting the duration as irrelevant, we are now more careful and state that “a typical assumption borrowed from music perception studies is that the events’ duration is less relevant for temporal structure than the intervals between their onsets [44-46]” (lines 113–115). Moreover, as we intend to stay away from any claims regarding perception, and focus on the rhythmic properties measurable in the signal itself, we have removed the references to ‘beat’ here and among the manuscript’s keywords.

 

But the major limitation of the article is its reliance on theoretical analysis without empirical validation. The specific sounds or signals used in the rhythmic analyses are not described, only general terms such as 'fish vocalizations' or 'singing primates' are used. A detailed description of the signals is essential to validate the findings and to ground the study beyond theoretical abstraction. This omission limits the ability to assess the applicability of the results to real-world scenarios.

We agree that this is an oversight and that the original manuscript made a somewhat large leap in going from empirical data to abstract temporal sequences. One reason for this is that this step is highly dependent on the type of data and research question, so we feel it is inappropriate to give strong advice here. However, we amended the manuscript by adding an extra paragraph outlining the typical process (lines 89–102). Moreover, we have highlighted the reference to the “Robust Rhythm Reporting Will Advance Ecological and Evolutionary Research” article by Hersh, Ravignani, and Burchardt (2023), which sets out in detail a methodological framework for reporting rhythmic analyses and bridges this gap between temporal data and rhythmic analyses.

 

The methodological framework should include specific guidelines for the application of the measures to real-world data.

Practical applications and data handling strategies should be more explicitly detailed to improve accessibility and replicability.

For both of these points, we would also like to defer to the article by Hersh, Ravignani, and Burchardt (2023). The main focus of our manuscript is meant to be the relationship between the nPVI and rk measures, and not a full methodological framework. We do agree that the first version of the manuscript was somewhat unclear about this goal. As such, we have added a couple of sentences at the end of section 1.2, explicitly outlining the goal of section 1.3 (lines 103–106).

 

In terms of conclusions, the claim that the RK distribution provides a 'more complete view' (line 276) of rhythmic structure compared to nPVI is compelling but not fully substantiated. The inclusion of real-world examples of how rk distributions capture nuances missed by nPVI would significantly strengthen this claim. Furthermore, the practical implications of the study, particularly in noisy or complex datasets, should provide a clearer understanding of its real-world applicability.

We thank the reviewer for their suggestion. We agree with their interpretation of the results and subsequent call to action on applying these findings in real-world datasets. In the current manuscript, however, our intention is to succinctly present the general relationship between the two rhythmic measures. We have extended the discussion of different sequences in the caption of Figure 2 (lines 280–284). Based on the suggestions by Reviewer 1, we have added example sequences to Figure 2, and we believe these sequences help demonstrate the point the reviewer was trying to make: the example sequences show even more strongly how the actual underlying temporal data can be qualitatively very different, yet result in the same nPVI values. We hope that the reviewer and editor agree that the addition of a fully worked out example would be out of scope for the current article. Instead, by keeping the current article short and focussed, we intend for ourselves and others to apply the suggestions in future work, within the context of a distinct research question and rhythmic analysis.
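The point that qualitatively different sequences can share one nPVI value can be checked with a small sketch. The two interval sequences below are invented for this illustration and are not the sequences shown in Figure 2; the function names are hypothetical, and the formulas are the commonly used definitions of the nPVI and of rhythm ratios, r_k = d_k / (d_k + d_{k+1}), which are assumed here to match the manuscript's.

```python
from statistics import mean


def rhythm_ratios(intervals):
    """r_k = d_k / (d_k + d_{k+1}) for each pair of adjacent intervals."""
    return [a / (a + b) for a, b in zip(intervals, intervals[1:])]


def npvi(intervals):
    """Normalized pairwise variability index of an interval sequence."""
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(intervals, intervals[1:])]
    return 100 * mean(terms)


alternating = [1, 3, 1, 3, 1]   # short-long alternation
slowing = [1, 3, 9, 27, 81]     # each interval three times the previous one

print(npvi(alternating), npvi(slowing))  # 100.0 100.0
print(rhythm_ratios(alternating))        # [0.25, 0.75, 0.25, 0.75]
print(rhythm_ratios(slowing))            # [0.25, 0.25, 0.25, 0.25]
```

Both sequences give the same nPVI, but the rhythm ratios of the alternating sequence fall on both sides of 0.5 while those of the slowing sequence sit entirely at 0.25, which is the kind of difference a single summary value cannot show.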

Despite these comments, I consider this to be a very interesting article.

Reviewer 3 Report

Comments and Suggestions for Authors

The manuscript presents two methods for quantifying Rhythm: the normalized pairwise variability index and rhythm ratios. The former is a single numerical value, whereas the latter essentially represents a distribution. The authors have derived that the normalized pairwise variability index is closely related to the expectation of rhythm ratios (Line 230). The authors also suggest paying attention to the dispersive properties of the rhythm ratios distribution (such as standard deviation and coefficient of variation) to obtain characteristics distinct from the aggregative trend captured by the normalized pairwise variability index.

In my previous research on biological rhythms, the approach typically involves first testing for the presence of rhythmicity, then analyzing the rhythms using Fourier transform, and finally interpreting these rhythms based on biological significance. The two indices mentioned in the manuscript, normalized pairwise variability index and rhythm ratios, are new to me, and I found them quite informative. As I am unfamiliar with the methods for quantifying Rhythm discussed in the manuscript, I am unable to judge whether the research is sufficiently novel. However, I can confirm that the mathematical derivations presented in the manuscript are accurate.

A minor suggestion for the manuscript: Please number all the equations to facilitate their citation within the text. Even equations not directly cited in the main text should be numbered for clarity.

Author Response

The manuscript presents two methods for quantifying Rhythm: the normalized pairwise variability index and rhythm ratios. The former is a single numerical value, whereas the latter essentially represents a distribution. The authors have derived that the normalized pairwise variability index is closely related to the expectation of rhythm ratios (Line 230). The authors also suggest paying attention to the dispersive properties of the rhythm ratios distribution (such as standard deviation and coefficient of variation) to obtain characteristics distinct from the aggregative trend captured by the normalized pairwise variability index.

We thank the reviewer for their clear and succinct summary of our manuscript, and are glad that our main message was clear.
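For readers of this record who do not have the manuscript at hand, the relationship summarized above can be sketched from the commonly used definitions. The notation below (intervals d_1, ..., d_m and ratios r_k) is an assumption for this illustration and may differ from the manuscript's own symbols and equation numbering.

```latex
% Assumed notation: d_1, \dots, d_m are the m intervals of a sequence.
\[
  r_k = \frac{d_k}{d_k + d_{k+1}}, \qquad
  \mathrm{nPVI} = \frac{100}{m-1} \sum_{k=1}^{m-1}
    \frac{\lvert d_k - d_{k+1} \rvert}{(d_k + d_{k+1})/2}
\]
% Since (d_k - d_{k+1}) / (d_k + d_{k+1}) = r_k - (1 - r_k) = 2 r_k - 1:
\[
  \mathrm{nPVI} = \frac{200}{m-1} \sum_{k=1}^{m-1} \lvert 2 r_k - 1 \rvert
                = \frac{400}{m-1} \sum_{k=1}^{m-1} \left\lvert r_k - \tfrac{1}{2} \right\rvert
\]
```

Under these definitions the nPVI is 400 times the mean absolute deviation of the rhythm ratios from 1/2, so it summarizes the r_k distribution in a single number and discards the rest of its shape.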

 

In my previous research on biological rhythms, the approach typically involves first testing for the presence of rhythmicity, then analyzing the rhythms using Fourier transform, and finally interpreting these rhythms based on biological significance. The two indices mentioned in the manuscript, normalized pairwise variability index and rhythm ratios, are new to me, and I found them quite informative. As I am unfamiliar with the methods for quantifying Rhythm discussed in the manuscript, I am unable to judge whether the research is sufficiently novel. However, I can confirm that the mathematical derivations presented in the manuscript are accurate.

We thank the reviewer for verifying the mathematical derivations, and for providing us with an additional perspective on rhythmic analyses. We agree that the Fourier transform is a fundamental tool to analyze temporal sequences, and now also briefly mention this in the discussion (lines 390–393).

 

A minor suggestion for the manuscript: Please number all the equations to facilitate their citation within the text. Even equations not directly cited in the main text should be numbered for clarity.

Thank you. We agree that numbering all the non-inline equations improves the readability of our manuscript. In line with this suggestion, we have numbered all of the non-inline equations.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Thank you for addressing the different points I raised. Again, congratulations on a nice piece of work that will be helpful for researchers in the field.

Best regards,

Valentin Bégel

Author Response

Thank you very much for the positive assessment of our revised manuscript!

We also wish to thank all reviewers again for their constructive comments and improvements to the manuscript.
