4.1. Comparing the Attributes between Data Sets
One way to compare the two years evaluated in the study is to plot trend graphs for 2017 and 2018. The graphs show some attributes with a similar trend (where the values of one year follow those of the other) and others where this behavior is not observed and the curves appear unrelated. A third behavior, in which the values of the attributes come very close at some point over the presentations, can be seen in Hands on Face. These results are shown in Figure 4.
The first group of attributes (those that evolve similarly throughout the presentations) shows a very clear similarity, as can be observed in the charts of Talked and Cross Arms, for example, Figure 4a,b. In this group, the behavior is the same across the three presentations of the semester: if the value of a variable falls in the second presentation of one year, it also falls in the same presentation of the other year. Note that the intensity of this decrease or increase varies between years.
The other group comprises attributes whose behavior differs throughout the presentations, such as Downside, Figure 4c. The percentages of time differ between years, and there is apparently no similarity of behavior between the evaluated years. Even so, the values remain close between the years in this group, although the trend described previously is not observed. There are also attributes that followed a similar trend at some point in the presentations but diverged at another point in the semester, as in the attribute Hands Down, Figure 4d.
To better compare the databases, inferential statistics were used. First, we tested the null hypothesis that the attributes come from a standard normal distribution [41]. The results showed that none of the attributes is normally distributed in either year. Thus, it was necessary to use a non-parametric method to test the null hypothesis that the data in two different databases are samples from continuous distributions with equal medians.
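The normality check described above can be sketched as follows. The text does not name the test, so this sketch assumes a one-sample Kolmogorov–Smirnov test against the standard normal distribution, with an asymptotic p-value and a common small-sample adjustment; the function names and the data are hypothetical and only illustrate the procedure.

```python
import math
from statistics import NormalDist

def ks_statistic(sample):
    """Largest gap between the empirical CDF and the standard normal CDF."""
    cdf = NormalDist(0.0, 1.0).cdf
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_pvalue(d, n, terms=100):
    """Asymptotic Kolmogorov p-value (assumed approximation, not an exact one)."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the alternating series converges to p = 1 in this range
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                 for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * series))

# Hypothetical attribute: fraction of presentation time, one value per observation.
time_fractions = [i / 40 for i in range(40)]
d = ks_statistic(time_fractions)
p = ks_pvalue(d, len(time_fractions))
reject_normality = p < 0.05  # 5% significance level, as in the text
```

Values confined to the interval from 0 to 1, such as time fractions, are far from a standard normal distribution, so a test of this kind rejects normality for them, which motivates a non-parametric comparison.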
The Wilcoxon rank sum test [42] is a non-parametric test for two populations with independent samples. It is equivalent to the Mann–Whitney U-test. In this work, the p-value was used to assess the equality of the two population medians at the 5% significance level. High values of p indicate that there is not enough evidence to reject the null hypothesis. Low values of p indicate otherwise.
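As an illustration of how such a p-value is obtained, the following is a minimal sketch of a two-sided rank sum test using the normal approximation, with midranks for ties and no tie or continuity correction. The authors' software may use an exact method or different corrections; the function name and the samples are hypothetical.

```python
import math

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test via the normal approximation."""
    # Pool both samples, remembering which group each value came from.
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n = len(pooled)
    rank_sum_x = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j for tied values
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n1, n2 = len(x), len(y)
    expected = n1 * (n1 + n2 + 1) / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (rank_sum_x - expected) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return z, p

# Hypothetical values of one attribute in two databases.
year_a = list(range(1, 11))
year_b = list(range(11, 21))
z_shift, p_shift = rank_sum_test(year_a, year_b)
z_same, p_same = rank_sum_test(year_a, year_a)
different_medians = p_shift < 0.05
```

With fully separated samples the p-value is far below 0.05, so the null hypothesis of equal medians is rejected; with identical samples the statistic is zero and the p-value is 1.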
Table 5 presents the tests for each pair of presentations (databases); combining all possibilities gives 15 comparisons between databases. To facilitate the analysis, the comparisons between presentations at different moments of different years were excluded. Consequently, the table shows only comparisons within the same year and comparisons between the same moments of different years. To make the table easier to read, values considered small were rounded to 0. As can be seen, some combinations obtained low values of p. This indicates that, for these attributes, the populations have statistically different medians. These results are shown in bold.
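The selection of comparisons described above can be sketched as a simple enumeration. The tuple representation of a database as (year, moment) is an assumption made only for this illustration.

```python
from itertools import combinations

# Six databases: three presentation moments in each of the two years.
databases = [(year, moment) for year in (2017, 2018) for moment in (1, 2, 3)]

# Every unordered pair of databases.
all_pairs = list(combinations(databases, 2))

def is_reported(a, b):
    """Keep pairs from the same year, or from the same moment of different years."""
    same_year = a[0] == b[0]
    same_moment = a[1] == b[1]
    return same_year or same_moment

reported = [pair for pair in all_pairs if is_reported(*pair)]
excluded = [pair for pair in all_pairs if not is_reported(*pair)]
```

Of the 15 possible pairs, this filter keeps three within-year comparisons for each year and three same-moment comparisons across years, and drops the six pairs that differ in both year and moment.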
In the year 2017, the medians were statistically similar in most cases. In the comparison of the first presentation with the second, for example, only one attribute was statistically different. In the comparison between the first and third presentations, no attribute was statistically different. In the last comparison, between the second and third presentations, two attributes tested as different. For 2017, only Downside showed differing medians in two tests.
For the presentations of 2018, more attributes presented significant differences. The first comparison, between presentations one and two, yielded three statistically different attributes; compared to 2017, this can be considered a similar result, since 12 attributes were evaluated. The comparison between the first and third presentations had one attribute with different medians. In the last comparison of 2018, between the second and third presentations, no attribute had statistically different medians. In 2018, Downside again presented statistically different medians in two tests.
Comparing the two years at the same moments (moments in sequence), it is possible to see that, over the course of the presentations, the number of differing attributes was reduced from 6 to 1. The comparison between the first presentations of the two years (2017 and 2018) showed the most statistically different attributes, six. The first presentations are those that contain the most observations in the databases, which can generate some variation.
As mentioned, the comparison of the first presentations yields a total of six statistically different attributes. In the comparison between the second presentations of each year, the value falls from 6 to 3, an indication that the students show (in relation to the evaluated attributes) more similarity between the years. Finally, in the comparison of the last presentations, the number of statistically different attributes falls to 1. This further drop can indicate that, by the end, the students presented in a more similar pattern in the two years.
4.3. Comparison among Clusters - Centroid Analysis
For the presentations of the year 2017, based on both the table of averages and the visual analysis, it was decided to analyze the value k = 3. It is important to make clear that this value of k was also chosen after an initial analysis of the centroids. This was done to avoid values of k for which the Silhouettes showed high averages but the centroid values of the groups were poorly separated; such a high average might otherwise seem like a good choice for the number of clusters. An example of this behavior can be seen in the value of k = 4 for presentation 3, in year 2017, Figure 5i.
For the year 2018, there was a greater variation in the values of k. The value k = 3 was chosen again, since it presents a good clustering, as can be seen in Figure 6d–f. The average silhouette value is not as high, but to allow comparison with the 2017 presentations, it is important to use the same value of k for both years.
For this work, the value k = 3 was chosen first because it presents good average silhouette values and a homogeneous formation of clusters, as can be seen in the Silhouettes. Another reason for this choice is that three distinct patterns appear when looking at the centroids. Choosing, for example, k = 2 as the number of clusters would merge one group of students into another cluster. The centroid analysis for k = 3 shows that there are subtle but important differences in the learning context that need to be explored. This section presents an analysis of the centroid values across the different presentations at the determined moments. In this step, as previously mentioned, the centroids already generated were evaluated. The objective is to find something related to the evolution or behavior of each cluster based on its attributes.
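The two criteria mentioned above, the average silhouette value and the separation of the centroids, can be sketched for a given assignment of observations to clusters. The sketch below does not reproduce the clustering step itself; it assumes Euclidean distance, at least two clusters, and uses made-up two-dimensional points, so the function names, labels and values are hypothetical.

```python
import math

def centroids(points, labels):
    """Mean position of each cluster."""
    groups = {}
    for p, lab in zip(points, labels):
        groups.setdefault(lab, []).append(p)
    return {lab: tuple(sum(c) / len(ps) for c in zip(*ps))
            for lab, ps in groups.items()}

def mean_silhouette(points, labels):
    """Average silhouette value over all observations (needs >= 2 clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    total = 0.0
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # a singleton cluster contributes a silhouette of 0
        # a: mean distance to the other members of the same cluster
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the nearest other cluster
        b = min(sum(math.dist(p, points[j]) for j in members) / len(members)
                for lab, members in clusters.items() if lab != labels[i])
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

# Hypothetical observations forming three tight groups.
points = [(0, 0), (0, 1), (10, 0), (10, 1), (20, 0), (20, 1)]
labels_k3 = [0, 0, 1, 1, 2, 2]   # three clusters
labels_k2 = [0, 0, 0, 0, 1, 1]   # two of the groups merged into one cluster

s3 = mean_silhouette(points, labels_k3)
s2 = mean_silhouette(points, labels_k2)
c3 = centroids(points, labels_k3)
```

In this toy case, merging two groups lowers the average silhouette and hides one centroid, which mirrors the argument that k = 2 would place one group of students inside another cluster. On real data the silhouette average alone may favor a different k, which is why the centroids are inspected as well.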
Figure 7 shows eight of the attributes in a polar graph. We chose not to plot the other four attributes because their centroid values were too small.
Some attributes stand out more and others less, in certain clusters formed. The work of [
45] presents some characteristics of excellent performances during a presentation, and other characteristics of performances called poor. These characteristics can be related to the attributes evaluated in the present work, thus making clear the behavior of the clusters.
The characteristics considered poor are related to withdrawn body language, such as defensive arm positioning (folded arms, hands in pockets), withdrawn posture and head down. The characteristics the authors consider typical of excellent presentations are open body posture, hand gestures to emphasize points or convey meaning, and inclusive eye contact. That work points out other characteristics as well; those mentioned here are the ones most closely related to the attributes evaluated in the present work. We can also associate speech with something positive in a presentation, since not talking indicates, in many situations, that the student explained less about the topic.
4.3.1. 2017 Presentations
In the first presentation of 2017, some attributes differ between the clusters. Some of these are analyzed here, chosen because of their importance to the behavior of the clusters. The CrossArms attribute, for example, presents a significant difference between the three clusters. Attributes of this type are highly relevant when students communicate, because, for example, crossed arms are a typical defensive gesture [3]. One cluster, group A, contains observations that spend more time in this posture, another (group C) less, and a third (group B) lies between the other two. The Talked attribute shows the same characteristic in this presentation, with three distinct levels in the three clusters: one with a higher value, one intermediate and one lower.
The Watching public attribute behaves differently from the other two. One cluster behaves differently from the other two clusters, which show this attribute for almost 100% of the presentation time. The same pattern, though not for the same clusters, is found in the Point attribute. Two clusters (A and C) have low values, and one (group B) has higher values. Most of the other attributes do not clearly separate the clusters. The OpenHands variable, for example, is practically the same in all three clusters. Some attributes have little variation, such as Straight, which is important because it indicates that the student does not have their head down, a behavior considered bad in oral presentations [45]. Another attribute that separates the three clusters is HandsDown. Three clear patterns can be seen in Figure 7a.
In the second presentation, the general behavior remained similar to the previous one. For example, the Watching public attribute shows practically the same behavior. In this presentation, it is also possible to identify a group, group C, with the highest value of the Talked attribute (32% of the total presentation time). However, the Talked attribute has closer values between two clusters than it did in the first presentation. A separation can also be noticed in the CrossArms attribute, with three very clear groups. In this presentation, the attribute HandsDown also showed a variation among three behaviors.
Finally, in the third presentation of this year, a new behavior, similar to the previous ones, was observed. On the other hand, the Talked attribute presents two groups (A and B) with high values (around 30%); this may indicate that the presenters converged toward speaking more over the year in which they were presenting. These same two groups also have close values for the Point attribute (approximately 10% for both). The Point attribute is relevant in oral presentations supported by PowerPoint slides, as pointing can connect the presenter's ideas [46]. There is also another group that presents low values for these attributes (Talked and Point).
4.3.2. 2018 Presentations
The attributes of 2018, for the first presentation, behaved slightly differently than in 2017. First, it can be seen that all groups have the Talked attribute with values below 10% of the total presentation time. The Point attribute, like HandsDown, shows a separation between the three chosen clusters. For the CrossArms posture, this separation into three groups could also be found. In general, this first presentation does not clearly mark three different behaviors. Apparently, the clusters are distributed over the attributes more randomly than in the year 2017.
In 2018, the values of the attributes among the three clusters are distributed in a scale. Group A had higher values, group C values were slightly smaller, and group B had values smaller than the previous two. Thus, one can see from
Figure 7d that the first presentation of 2018 had its clusters forming something resembling a scale. This behavior is different from what has been presented so far (the clusters varied according to the centroids within the presentations).
The apparent randomness of the data and behaviors in the first presentation does not occur in the second presentation. When analyzing the data values, it can be observed that the groups are formed following a “rule”, or behavior, much like the one seen in 2017: one cluster with a higher value in the attribute, another with an intermediate value, and a third with a lower value than the previous two.
The last presentation of 2018 behaves differently from the rest of that year, as shown in Figure 7f. First, it can be observed from the graph that the centroid values decrease with respect to the previous presentations. Another point is that the clusters changed their general formation rule. For example, the Watching public attribute no longer has two clusters with high values and one with a low value. In this presentation, two clusters, groups B and C, have low values. In general, in 2018, we cannot observe the same pattern as in 2017; the rule of formation of clusters is not the same, particularly in the last presentation of 2018.
4.3.3. Comparison among Years
It is difficult to point out which are the same groups (if any) throughout the year, between presentations. For this reason, the comparison made in the present work aims to identify the path followed throughout the year, and especially to compare the last presentations of each year. With this, we can identify patterns that formed during the semester. The results of Table 5 show that, from the first to the last presentation, the number of statistically different attributes between the years drops from 6 to 1.
The year 2017 showed some consistency in patterns. Presentations varied in centroid values; however, the polar chart helped find three behaviors in the three presentations. This cannot be seen for the year 2018, where the same behavior (or similarity) between presentations is not evident.
What can be seen in both years (2017 and 2018) is a convergence of some attributes with greater values in the last presentation. For example, Group A has a similar value for the CrossArms variable in both years. In the same way, Group B has attributes with similar values, such as Watching public and CrossArms. Group C presents some variation between the presentations, but it is possible to perceive a similarity between the presentations of the two years.
After all these results, it can be said that three distinct groups were found in the 2017 presentations; this is easy to observe in Figure 7. For 2018, the same patterns as in 2017 were not observed, and it became more difficult to separate the clusters. The clusters were called Group A, Group B and Group C. This choice was made after an analysis of the Silhouettes and their means. We also took into account the centroid values of these clusters (k = 2 returns a better silhouette mean), where a clear separation between the three clusters was observed. Larger numbers of clusters (k = 5 in the first presentation of 2018) were also possible in specific cases (Table 6). However, for all the other presentations, the average Silhouette value was better at k = 3. This can be called a similarity between the years. Similar to [47], our study was concerned with finding patterns in student interactions in complex learning environments, in this case presentations. This process can give teachers and educators an easy and efficient way to quickly analyze students' body postures.
There are general similarities between the data collected in 2017 and 2018. Table 3 shows the similar behavior of the mean values between the two years. Figure 4 shows the similarities in the attributes evaluated. In particular, it was noted that the presentations vary at the beginning of the semester. This can be explained by the number of observations in the first presentation of 2018 (59 observations), a much larger number than in 2017 (40 observations). The nonparametric Wilcoxon rank sum test helps to note this difference. In Table 5, the comparison with the most significantly different attributes was between the first presentations of 2017 and 2018 (six attributes). It can also be seen that, at the end, in the comparison between the last two presentations, the number of these attributes falls to 1. This indicates greater statistical equality between the students at the end of the presentations.
Another point of the study to be highlighted is the importance of the attribute Talked. Due to our experimental setting, this variable can more easily identify certain patterns or even a cluster within the whole dataset. In addition, the Talked attribute can highlight the other characteristics (attributes) of students, since good communicators must also have skills with body gestures [3].