4.2. Discussion
This study provides insights into the dynamics of ME and its relationship with dominance, causality, correlation, speech, and gesture modalities in dialogues. By analysing a multimodal dataset, we examine the movements of interlocutors in relation to other modalities, investigating how these dynamics are associated with perceived dominance and how often the more dominant participant leads the discussion. The findings reveal complex temporal patterns of non-verbal communication and behavioural coordination. Recall that independent observations of which participant is the more dominant in each of the dialogues considered provide a separate indication of which party “leads” the interaction. We have shown that the labeling of windows according to ME GC conditions is compatible with these independent observations (which, as noted, were also potentially influenced by spoken content).
Synchronisation is an indication of temporal alignment: simple lagged correlations of ME capture aspects of that alignment, while Granger causation provides insights into the directional influence between interlocutors. Positive correlations are often associated with cooperative alignment, whereas negative correlations are complementary and might indicate moments of change or transition between roles such as leading and speaking. The results reported in Section 4.1.1 demonstrate that lagged correlation and GC are related but distinct phenomena. Of course, this is an analytical truth, based on their definitions; however, here this study demonstrates the use and interaction of both methods in classifying ME synchronisation. These findings suggest that correlation and causation are interdependent in conversation, with the effect of one influenced by the level of the other, regardless of whether the MD or LD interlocutor initiates movement. Specifically, analysis using GC reveals that the ME of MD interlocutors frequently Granger-causes the ME of LD interlocutors across multiple segments of time, demonstrating a relationship between dominance perception and leading behaviours. This supports the hypothesis that interlocutors perceived as dominant often lead during conversations.
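To make the distinction between the two measures concrete, the following is a minimal sketch and not the pipeline used in this study. It assumes two synthetic ME series (`md`, `ld`), a single lag, and hand-written helpers `lagged_corr` and `granger_f`; all of these names and settings are hypothetical and introduced only for illustration. The lagged correlation relates one series to the past of the other, whereas the GC statistic asks whether that past improves prediction beyond what the follower's own history already provides.

```python
import numpy as np


def lagged_corr(leader, follower, lag=1):
    """Pearson correlation between leader[t - lag] and follower[t]."""
    return float(np.corrcoef(leader[:-lag], follower[lag:])[0, 1])


def _rss(X, y):
    """Residual sum of squares of an ordinary least-squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def granger_f(cause, effect, lag=1):
    """F statistic for the hypothesis that `cause` Granger-causes `effect`.

    Restricted model: effect[t] on its own `lag` past values.
    Unrestricted model: the same, plus `lag` past values of `cause`.
    """
    n = len(effect)
    y = effect[lag:]
    own = np.column_stack([effect[lag - k:n - k] for k in range(1, lag + 1)])
    other = np.column_stack([cause[lag - k:n - k] for k in range(1, lag + 1)])
    ones = np.ones((n - lag, 1))
    X_restricted = np.hstack([ones, own])
    X_unrestricted = np.hstack([ones, own, other])
    rss_r = _rss(X_restricted, y)
    rss_u = _rss(X_unrestricted, y)
    dof = (n - lag) - X_unrestricted.shape[1]
    return ((rss_r - rss_u) / lag) / (rss_u / dof)


# Synthetic data (an assumption): the LD series follows the MD series by one step.
rng = np.random.default_rng(0)
n = 500
md = rng.normal(size=n)
ld = np.empty(n)
ld[0] = rng.normal()
ld[1:] = 0.8 * md[:-1] + 0.3 * rng.normal(size=n - 1)

r_md_ld = lagged_corr(md, ld)   # MD precedes LD
r_ld_md = lagged_corr(ld, md)   # LD precedes MD
f_md_ld = granger_f(md, ld)     # does MD Granger-cause LD?
f_ld_md = granger_f(ld, md)     # does LD Granger-cause MD?
print(f"lagged r: MD->LD {r_md_ld:.2f}, LD->MD {r_ld_md:.2f}")
print(f"GC F:     MD->LD {f_md_ld:.1f}, LD->MD {f_ld_md:.1f}")
```

In this constructed case both measures point the same way. With strongly autocorrelated series they can diverge, because the GC comparison conditions on the follower's own past while the lagged correlation does not.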
However, the LD interlocutor also exhibits some moments of influence, particularly during cooperative alignment, that is, positive correlation (Pos-LD causes MD when MD precedes), and during speaking (see Table 3). This suggests bidirectional causation: while one interlocutor may be perceived as MD in a conversation, they do not exclusively control or direct the interaction at all times. Instead, both interlocutors can influence the dialogue, even if one is generally MD and exerts more influence. These findings align with prior research emphasising mutual adaptation in interpersonal coordination, and with studies on alternating leadership dynamics, which draw attention to the fact that while dominance influences leadership frequency, conversational leadership is dynamic and bidirectional, with roles changing over time. Negative correlation during moments when MD interlocutors precede (Neg-MD Causes LD) and when MD interlocutors respond to LD interlocutors (Neg-LD Causes MD) reflects moments when the influence from MD is aligned with the LD’s movements but in the opposite direction (complementary alignment). This may reflect challenges in maintaining synchronisation. Alternatively, this complementary alignment may indicate a transition between the interlocutors. Leadership may be visible not only in positive correlations but also in negative ones, where the leader adjusts their ME, either reducing or increasing it, to allow their interlocutor to follow, effectively managing the conversation. This suggests that negative correlation could be part of a deliberate strategy to transition roles and maintain the leading process.
The incorporation of gesture and speech modalities highlights their significant roles in ME synchronisation and causation dynamics. When MD interlocutors gesture, their influence on the LD interlocutor’s ME is significantly increased. This suggests that gestures promote dominance (perhaps because they convey additional information beyond the linguistic content; perhaps because they simply occupy the visual field), shape the conversation and, as a result, attract the attention of observers. Results show that leading behaviour is often accompanied by specific gesture types, such as Beat and Iconic gestures. For example, MD interlocutors frequently employ Beat gestures during windows classified as “MD Causes LD”, where the nonverbal modality evidently strengthens their leading role. Iconic and Beat gestures employed by the leader can reinforce their role in structuring the conversation. Beat gestures, in particular, help emphasise rhythm and alignment in communication, while Iconic gestures convey meaning and strengthen the leader’s role in managing the dialogue. Whether an interlocutor takes up such a gesture is not addressed here, even though the implication of the label “MD Causes LD” in any window is that the ME of LD is influenced by the ME of MD, and that includes the ME of the gesture itself. This can be partly assessed using distributions of gesture sequence labelings corresponding to the interlocutors’ individual and joint activity; this is a separate set of investigations.
We explore how the temporal flow of ME values relates to categories of interaction through regression analysis (see Appendix A). Regression analyses show that causation, correlation, gestures, and speaking modalities affect variability in ME across segments of time captured by each window. The variability in ME accounted for by the model increased when these conversational features were added as predictors. This underlines the need to consider multiple interacting factors to fully grasp the complexity of conversational dynamics. However, the regression models do not show any ME pattern shaped by dominance, which points to the complexity of the dynamics of human–human interactions, shaped by interlocutors’ roles and multimodal behaviours (detailed in Appendix A.1).
The analysis of interaction differences between categories provided further evidence of relationships among causation, correlation, and speech modality. Pairwise comparisons show that contrasts in ME depend on GC categories, particularly during speaking and during moments of alignment, whether positively or negatively correlated (see Table A6 and Table A7). When the ME data of MD interlocutors are lagged to precede LD, the correlation is significant, and MD speaks, ME values differ more significantly, indicating a stronger influence of MD interlocutors during windows that are either positively or negatively correlated in interlocutors’ ME values, particularly the positively correlated ones. This contrast in ME is not seen where LD participants precede. Speaking moments consistently show higher ME than silent ones, which emphasises the importance of vocal communication and confirms that speaking increases motion energy. During alignment, MD interlocutors have a stronger influence on ME when they lead, while LD interlocutors have a more limited impact.
When examining a dataset with the aim of determining who is leading the conversation on some measurement (here, ME) for any particular stretch of time (such as the windows we have used), it is necessary to examine the data from the perspective of lagged values. This raises the question of whether simple lagged correlations are adequate or whether, instead, GC provides a clearer picture. Here, we used GC on the same lagged data. For arbitrary conversations, it is not clear which party should be taken as preceding the other. This leads to examining both forms for the dyadic conversations we have studied. In these data, there is a justification for focusing on only one of the two parties, in that independent assessments determine that, in each conversation, one of the interlocutors is more dominant than the other. Further, many of the results reported have involved similar effects, if differing in magnitude, regardless of whether the ME values of the MD party are lagged ahead of those of the LD party, or the other way around.
A next step in interpreting these results involves additional analysis, equating windows between the two lagged versions of the data in an ordinal manner (by virtue of the lags, they do not contain identical values). One may then consider the agreement on labels for the ith window between the two views of the data: whether and to what extent the two laggings agree on labels for the ith window, using both the GC labels and the lagged correlation labels. For example, the first window in the MD precedes LD ME lagged data may be given the lagged correlation label “Open” and the first window in the LD precedes MD data may have the label “Open”, too; or, the first window may have different labels between the two lagged views of the data; and so on, for each successive window, and also in relation to the GC labels. Given the lagged correlation labels for each of the 28,266 windows in the data, the “confusion” matrix depicted in
Table 6 emerges. To assess the degree of association between the categorisation schemes, we see that Cramér’s
, a weak association. To clarify this further, one may arbitrarily assign one lagged precedence categorisation scheme as the reference (LD precedes) and compute the “accuracy” of the other (MD precedes). A majority classifier (always assigning the “Open” correlation label) would have an overall accuracy of 0.87; here, the overall accuracy score is 0.7998. Thus, a majority classifier would produce more agreement than using the labels of one lagged order of the data to guess the labels for the windows in the other lagged order.
Table 7 shows the resulting precision, recall, and F1 scores, by category (note that if the reference categorisations were switched, so, too, would values for precision and recall across the categories). However, it can be demonstrated using the Pearson residuals from a Chi-squared test of the contingency
Table 6 that significant agreement falls exactly along the diagonal, where the largest positive magnitudes in any row/column are witnessed (
)—see
Table 8. This entails that for the labeling of the
ith window as computed for the MD precedes LD lagged ME data and for the LD precedes MD lagged ME data, there is significant divergence, and also, there is significant agreement.
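The agreement measures discussed in this paragraph can be sketched as follows. The counts are invented for illustration and are not those of Table 6, and the variable names are likewise hypothetical. The sketch computes Cramér's V, the overall accuracy against a reference view, the majority-class baseline, and the Pearson residual of each cell (observed minus expected, divided by the square root of expected).

```python
import numpy as np

# Hypothetical cross-classification of window labels (NOT the study's data).
# Rows: reference view (LD precedes); columns: other view (MD precedes).
labels = ["Pos", "Neg", "Open"]
table = np.array([
    [40,  5,  55],
    [ 5, 30,  65],
    [60, 70, 670],
], dtype=float)

n = table.sum()
row = table.sum(axis=1, keepdims=True)   # reference-label totals
col = table.sum(axis=0, keepdims=True)   # other-view label totals
expected = row @ col / n                 # counts expected under independence

# Chi-squared statistic and Cramér's V for an r x c table.
chi2 = ((table - expected) ** 2 / expected).sum()
cramers_v = np.sqrt(chi2 / (n * (min(table.shape) - 1)))

# Agreement on the diagonal, and the baseline obtained by always
# guessing the most frequent reference label.
accuracy = np.trace(table) / n
majority_baseline = row.max() / n

# Pearson residuals: positive values mark cells observed more often
# than independence would predict.
pearson_resid = (table - expected) / np.sqrt(expected)

print(f"Cramér's V = {cramers_v:.3f}")
print(f"accuracy = {accuracy:.3f}, majority baseline = {majority_baseline:.3f}")
print(np.round(pearson_resid, 2))
```

In this invented table the accuracy falls below the majority baseline while the largest residual in every row still sits on the diagonal, which is the same qualitative pattern of coexisting divergence and agreement described above.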
Similar reasoning applies to the labels provided through GC tests of windows. Here, the association is even less strong (Cramér’s
). Again, arbitrarily holding the LD preceding MD lagged data as the source of “true” reference labels, we analyse the “accuracy” of the labels that are computed for MD preceding windows. A majority classifier assigns “No Causation” with an accuracy score of 0.86; here, overall accuracy is 0.7769.
Table 9 shows the precision, recall, and F1 scores, by category. The Bi-Causation label yields the worst “performance”. Finally, based on the contingency table given in (
Table 10) we use the adjusted standard residuals (
Table 11) from a Chi-squared test (
) to explore the diagonal of the cross-classification. We note that a Bonferroni adjustment to
, provides
, and a critical value of 2.96 for significance. The largest positive magnitudes in the residuals (note that all of the residuals are significant) for any row and column fall along the diagonal, except for the Bi-Causation label. Where there is agreement between two classifications, positive (significant) residuals along the diagonal indicate that instances of observed agreement are significantly more frequent than would be expected if there were no relationship between the labels assigned to windows by the two classification schemes.
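The adjusted residuals and the Bonferroni-corrected threshold can be sketched as below. The nominal level and the number of comparisons are not recoverable from the text as it stands, so the values used here (a nominal level of 0.05 and the 16 cells of a four-by-four table of GC labels) are assumptions; under them a two-sided normal critical value comes out close to the 2.96 quoted. The counts in `gc_table` are invented and are not those of Table 10, and the function name is hypothetical.

```python
from statistics import NormalDist

import numpy as np

# Assumed settings (not stated in the text as extracted).
alpha = 0.05
n_cells = 4 * 4                      # four GC labels against four GC labels
alpha_adj = alpha / n_cells          # Bonferroni adjustment
critical = NormalDist().inv_cdf(1 - alpha_adj / 2)   # two-sided threshold


def adjusted_residuals(table):
    """Adjusted standardised residuals of a contingency table."""
    table = np.asarray(table, dtype=float)
    n = table.sum()
    row = table.sum(axis=1, keepdims=True)
    col = table.sum(axis=0, keepdims=True)
    expected = row @ col / n
    scale = np.sqrt(expected * (1 - row / n) * (1 - col / n))
    return (table - expected) / scale


# Hypothetical counts; rows and columns ordered as
# No Causation, MD causes LD, LD causes MD, Bi-Causation.
gc_table = np.array([
    [700, 40, 45, 15],
    [ 45, 30,  5,  5],
    [ 40,  5, 35,  5],
    [ 15,  5,  5,  5],
])

adj = adjusted_residuals(gc_table)
significant = np.abs(adj) > critical   # cells exceeding the corrected threshold

print(f"adjusted alpha = {alpha_adj:.6f}, critical value = {critical:.3f}")
print(np.round(adj, 2))
print(significant)
```

Each adjusted residual is approximately standard normal under independence, which is why a normal quantile serves as the threshold here.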
The comparison of agreement on correlation labeling of windows and on GC labeling of windows depending on whether the MD precedence or LD precedence view of the data is chosen indicates that the precedence selected is an important decision if only one precedence is to be explored. With the GC labels, one might have anticipated that the LD causes MD labels on the LD preceding lags would mirror the MD causes LD labels on the MD preceding lags; however, this is not observed. Nonetheless, for all the disagreements on labels applied to windows between the two lag precedence possibilities, we note significant agreement as well.
In relation to methods and next steps, this points to the possibility of taking an intersective approach, analysing the interactions reported here throughout (relations to speaking or not, gesture types, and specific interactions with levels of ME) with a focus on the windows according to the ordinal sequence where the two precedence views lead to agreement. It also suggests more detailed examination of the complement, the windows where the distinct lag precedence possibilities and the labeling computations lead to disagreement over labels.
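A minimal sketch of this intersective split is given below. The two label sequences are invented, and the variable names are hypothetical; the only point illustrated is that windows are matched by ordinal position and then partitioned into those where the two precedence views agree and those where they do not.

```python
from collections import Counter

import numpy as np

# Hypothetical correlation labels for the same six ordinal windows
# under the two lag precedence views (NOT the study's data).
md_first = np.array(["Open", "Pos", "Open", "Neg", "Open", "Pos"])
ld_first = np.array(["Open", "Pos", "Neg", "Neg", "Open", "Open"])

agree = md_first == ld_first
agree_idx = np.flatnonzero(agree)       # windows for the intersective analysis
disagree_idx = np.flatnonzero(~agree)   # the complement, for separate study

# Label distribution among the windows on which both views agree.
agreed_counts = Counter(md_first[agree].tolist())

print("agreeing windows:", agree_idx.tolist())
print("disagreeing windows:", disagree_idx.tolist())
print("labels among agreeing windows:", dict(agreed_counts))
```

Any downstream comparison (speaking versus silent, gesture type, level of ME) could then be run separately on the two index sets.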
These findings and the relationship between causation, correlation, gesture and speaking reveal aspects of the dynamics of leadership and alignment of bodily motion in cooperative dyadic conversations. We also see emergent recommendations with respect to methods of computing motion energy correlation and GC labels for temporal windows within the lagged data.