Ranking Features on Psychological Dynamics of Cooperative Team Work through Bayesian Networks

The aim of this study is to rank some features that characterize the psychological dynamics of cooperative team work in order to determine priorities for interventions and training: the leading positive feedback, cooperative manager and collaborative manager features. From a dataset of 20 cooperative sport teams (403 soccer players), the characteristics of the prototypical sports teams are studied using an average Bayesian network (BN) and two special types of BNs, the Bayesian classifiers: naive Bayes (NB) and tree augmented naive Bayes (TAN). BNs are selected as they are able to produce probability estimates rather than predictions. BN results show that the antecessors (the “top” features ranked) are the team members’ expectations and their attraction to the social aspects of the task. The main node is formed by the cooperative behaviors; the consequences ranked at the BN bottom (ratified by the TAN trees and the instantiations made) are the roles assigned to the members and their survival inside the same team. These results should help managers to determine contents and priorities when they have to face team-building actions.


Introduction
In high performance sports, it is crucial to rank features to determine their degree of influence in a cooperative work team, with the ultimate objective of maximizing the team's performance. Along these lines, several studies have been carried out in the field of social psychology, where features such as motivation, cohesion, leading positive feedback or social attraction play an important role due to their close relation to performance [1-5].
Cooperative team work is a well-known fact today [6], even when the analysis of the associated psychological factors [7] is taken into account. However, it is difficult to find studies that relate a team's need to cooperate to effective performance [8]. Some psychological factors of collaborative teams have been studied more than others in terms of performance, such as cohesion [9,10] or group facilitators and blockers' roles [11]. Features that are not usually studied, such as the specificity of the jobs [12] (or playing positions, in this case), the generation of performance expectations, the motivational climate generated by the manager [13] and the leadership styles of the coach, are considered in this paper.
Bayesian networks (BNs) [14-18], well suited to reasoning with uncertain domain knowledge, can be applied to aid teams by providing estimates that characterize cooperative and collaborative work. BNs have proven to be a strong tool for discovering relationships between variables while attempting to separate indirect from direct associations [8,19-21], and they can capture the way an expert understands the relationships among all of the features [22]. They enable human experts to better understand the modeled domain. In contrast to a deterministic understanding of causality [23], BN modeling lies within the data mining and machine learning literature [24,25]. The network structure is a directed acyclic graph (DAG) where each node represents a random variable [26,27], and the arcs may represent causality [28-30]. BNs combine graph theory and probability theory to represent relationships between variables (nodes in the graph) [8,19]. They give a compact representation of a joint probability distribution via conditional independence. The major advantage of BNs is the ability to represent and, hence, understand knowledge. BNs are the best-known classifiers that are able to provide probability distributions concisely and comprehensibly [31,32]. In [33], the authors considered the BN model with the naive Bayes algorithm to be one of the most effective classification algorithms.
Recently, there has been increasing attention regarding the application of BNs in competitive sport contexts. They have been used to analyze team performance related to different types of cooperation between team members and various motivational climates generated by teams' coaches in sports performance [20,34], achieving different results from those obtained by classical statistics.
Based on the dataset obtained in our experimental study, the main objective of this work is to rank the features that characterize the leading positive feedback feature, the cooperative manager feature and the collaborative manager feature through an average BN, while comparing its performance with two special types of BNs, the Bayesian classifiers: naive Bayes (NB) and tree augmented naive Bayes (TAN). In the NB classifier, the class attribute is the single parent of each node of an NB network. The TAN classifier models relationships among the features of at most order one. The experimental results show that our average BN and the TAN models have similar performances on the classification of the leading positive feedback feature, the cooperative manager feature and the collaborative manager feature, with the highest accuracy obtained by the BN (90%, 85% and 87%, respectively), closely followed by the TAN classifier (90%, 84% and 87%, respectively).
The paper is organized as follows. Section 2 presents the materials and methods, describing the process used to obtain the different cooperative team models. Section 3 presents the results of the leading positive feedback, cooperative and collaborative manager models through a BN. Section 4 presents some discussion. Finally, Section 5 concludes the paper.

Participants
The participants were male semi-professional football players from 20 teams who participated in the Third Division of a Spanish soccer League. A sample of 403 players between the ages of 18 and 39 years, with a median age of 24 years (IQR = 27-21.5), was used.

Procedure
The data were collected in the middle of the regular competitive season, when the majority of teams' dynamic processes are well engaged. All participants were legal adults, and their participation was voluntary; the authors obtained written consent from each participant. Participants completed the questionnaires in the changing room before a training session.

Leadership Behaviors
Coach leadership behaviors were assessed using an adapted Spanish version of the Leadership Sport Scale (LSS) [35]. This is a 40-item instrument designed to measure the following five dimensions of leadership: two of them measure decision-making style (democratic behaviors, autocratic behaviors); another two measure motivational tendencies (social support and positive feedback); and the last one measures instruction behavior (training and instruction). Responses were rated on a 5-point scale ranging from strongly disagree (1) to strongly agree (5).

Group Environment Questionnaire
The Spanish version of the GEQ [36,37] was used to assess team cohesion. This inventory of 18 items comprises the following four factors: group integration-task, which refers to an individual's beliefs about team closeness, similarity and bonding around the group's task; group integration-social, which refers to a team member's sense of group closeness, similarity and bonding as a social unit; individual attraction to the group-task, which refers to a group member's feelings about personal involvement in relation to shared group goals and productivity; and individual attraction to the group-social, which refers to the team member's impressions of social interactions and personal acceptance within the group. These four factors are grouped into the two main factors of social cohesion and task cohesion. Responses were rated on a 5-point scale ranging from strongly disagree (1) to strongly agree (5).

Work Experience and Expectations
Work experience and expectations were measured through an ad hoc questionnaire containing the following questions, from "would have to play a lot less" to "would have to play a lot more".
1. Work experience: "How many times have you played in this team?".
2. Expectations: we asked players and managers to predict what position they believed they would occupy in the standings at the end of the season.

Specificity Workplace
The positions of the players on the pitch were divided into 4 groups: goalkeeper, defender, midfielder and striker.

Bayesian Networks Modeling
The BN representation is a DAG whose nodes are the random variables in our domain and whose edges may correspond to the direct influence of one node on another. BNs capture the dependencies and conditional independencies, which are represented by a joint probability distribution. BNs are not restricted to representing distributions satisfying the independence assumptions that are implicit in the NB and TAN models.
To obtain a BN, it is necessary to determine a structure defined by a DAG and the conditional probabilities assigned to each node of the DAG. Therefore, learning a BN implies two tasks: (i) structural learning, that is, the identification of the topology of the BN; and (ii) parametric learning, that is, the estimation of numerical parameters (conditional probabilities) given a network topology.
One important characteristic of BNs is the global Markov property, which states that any node $X_i$ is conditionally independent of any other node given its Markov blanket, i.e., $I(X_i, \text{non-Markov-blanket}(X_i) \mid \text{Markov-blanket}(X_i))$; the Markov blanket of a node includes its parents, its children and the children's other parents (spouses). The global Markov property has been applied to select the features with influence on the leading positive feedback, cooperative manager and collaborative manager features.
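The Markov blanket definition above can be illustrated with a short Python sketch. The DAG below is a hypothetical toy structure chosen only for illustration (it is not the network learned in this study), and `markov_blanket` is an illustrative helper, not a function of the bnlearn package.

```python
def markov_blanket(parents, node):
    """Markov blanket of `node` in a DAG given as {child: set_of_parents}."""
    # Children are the nodes that list `node` among their parents.
    children = {c for c, ps in parents.items() if node in ps}
    # Spouses are the other parents of those children.
    spouses = {p for c in children for p in parents[c]}
    return (set(parents.get(node, ())) | children | spouses) - {node}

# Hypothetical toy DAG (assumption, not the structure in Figure 1).
toy_dag = {
    "expectations": set(),
    "socialAttraction": set(),
    "cooperativeManager": {"expectations", "socialAttraction"},
    "leadingPositiveFeedback": {"cooperativeManager"},
    "roles": {"cooperativeManager", "leadingPositiveFeedback"},
}

blanket = markov_blanket(toy_dag, "leadingPositiveFeedback")
print(sorted(blanket))
```

In this toy graph, the blanket of `expectations` contains `socialAttraction` even though no arc joins them, because both are parents of the same child.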
An average BN and two specific classifiers among all of the different Bayesian classifiers, NB and TAN, have been selected to study the leading positive feedback, cooperative manager and collaborative manager features.
Learning a BN from data is a form of unsupervised learning, in the sense that the learner does not distinguish the class variable from the attribute variables in the data [38]. NB and TAN classifiers are special types of BN, where supervised learning is performed.
A Bayes classifier assigns the most probable a posteriori (MAP) class to a given instance $x_i = (x_{i1}, \ldots, x_{in})$, as shown in Equation (1):

$$c^{*} = \arg\max_{c_j} P(c_j \mid x_i) = \arg\max_{c_j} P(x_i, c_j) \qquad (1)$$

which is optimal in terms of minimizing the conditional risk under a symmetric loss function [39], where BN classifiers [38] approximate $P(x, c_j)$ with a factorization according to the DAG structure of a BN [40]. A Bayesian classifier applies Bayes' theorem (see Equation (2)):

$$P(c_j \mid x_i) = \frac{P(x_i \mid c_j)\, P(c_j)}{P(x_i)} \qquad (2)$$

The probability that a sample with characteristics $x_i$ belongs to a class $c_j$ is denoted by $P(c_j \mid x_i)$, i.e., the posterior probability. The prior probability, denoted by $P(c_j)$, is the probability that a sample belongs to a class $c_j$ given no information on its characteristic values.
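As a minimal numerical sketch of the MAP rule and Bayes' theorem described above, the following Python snippet computes a posterior from a prior and a likelihood. All probabilities are made-up illustrative values, not estimates from the study's data.

```python
def posterior(prior, likelihood):
    """Bayes' theorem: P(c|x) = P(x|c) P(c) / sum over classes of P(x|c') P(c')."""
    joint = {c: likelihood[c] * prior[c] for c in prior}
    evidence = sum(joint.values())  # P(x), the normalizing constant
    return {c: p / evidence for c, p in joint.items()}

# Illustrative numbers only (assumptions).
prior = {"high": 0.6, "low": 0.4}        # P(c_j)
likelihood = {"high": 0.2, "low": 0.7}   # P(x_i | c_j) for one observed instance

post = posterior(prior, likelihood)
map_class = max(post, key=post.get)      # class with the highest posterior
print(post, map_class)
```

Here the class with the lower prior wins because the observed instance is much more likely under it.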

Conditional Independence of Triplets of Random Variables
Let us consider X, Y, Z sets of random variables (features in the dataset). It is said that X is conditionally independent of Y given Z in a distribution P if P satisfies P(X|Y, Z) = P(X|Z).

Bayesian Network Model
BN models estimate the joint probability distribution $P$ over a vector of random variables $X = (X_1, \ldots, X_n)$. The joint probability distribution, factorized as a product of several conditional distributions, denotes the dependency/independency structure given by a DAG:

$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P\left(X_i \mid \mathrm{Pa}^{G}(X_i)\right) \qquad (3)$$

Equation (3) (where $\mathrm{Pa}^{G}(X_i)$ denotes the parent nodes of $X_i$ in the DAG $G$) is the main reason for the formulation of a multivariate distribution by BNs; this equation is also called the chain rule for Bayesian networks.
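The chain rule for Bayesian networks can be sketched in a few lines of Python. The three-node network and its conditional probability tables below are hypothetical and serve only to show that multiplying each node's probability given its parents yields a proper joint distribution.

```python
from itertools import product

# Hypothetical network A -> C <- B with binary states (assumption).
parents = {"A": (), "B": (), "C": ("A", "B")}
cpt = {
    "A": {(): {"low": 0.3, "high": 0.7}},
    "B": {(): {"low": 0.5, "high": 0.5}},
    "C": {
        ("low", "low"): {"low": 0.9, "high": 0.1},
        ("low", "high"): {"low": 0.6, "high": 0.4},
        ("high", "low"): {"low": 0.5, "high": 0.5},
        ("high", "high"): {"low": 0.2, "high": 0.8},
    },
}

def joint_probability(assignment):
    """Multiply P(X_i | Pa(X_i)) over all nodes (chain rule for BNs)."""
    prob = 1.0
    for node, pa in parents.items():
        pa_states = tuple(assignment[p] for p in pa)
        prob *= cpt[node][pa_states][assignment[node]]
    return prob

p = joint_probability({"A": "high", "B": "low", "C": "high"})
# Summing over every assignment should give 1 for a valid factorization.
total = sum(
    joint_probability(dict(zip("ABC", states)))
    for states in product(["low", "high"], repeat=3)
)
print(p, total)
```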
In order to obtain the DAG, we used the bnlearn package [43] of the R language [44]. To obtain the structure, there are two options: either select a single best model or obtain an average model, which is known as model averaging [45]. Our model was learned by the tabu search algorithm, which explores the search space starting from a network structure and adding, deleting or reversing one arc at a time until the score can no longer be improved. The final model was obtained by repeating the structure learning several times; a large number of network structures (1000 BNs) was explored to reduce the impact of locally optimal (but globally suboptimal) network learning. The learned networks were averaged to obtain a more robust model. The averaged network structure was obtained using the arcs present in at least 85% of the networks, which gives a measure of the strength of each arc and establishes its significance given a threshold (85%) (see Figure 1 and Listing 1).
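The thresholding step of model averaging can be sketched as follows. This is a simplified Python illustration, not the bnlearn implementation used in the study: it assumes each learned network is available as a list of (parent, child) arcs, counts how often each directed arc appears, and keeps those at or above the threshold. It does not handle arc direction strength or check that the result is acyclic, and the toy networks are invented.

```python
from collections import Counter

def average_structure(networks, threshold=0.85):
    """Keep arcs whose relative frequency across the networks is >= threshold."""
    counts = Counter(arc for arcs in networks for arc in set(arcs))
    n = len(networks)
    strength = {arc: k / n for arc, k in counts.items()}
    kept = {arc for arc, s in strength.items() if s >= threshold}
    return kept, strength

# Toy stand-in for the learned networks (assumption): 10 instead of 1000.
both_arcs = [("expectations", "cooperativeManager"), ("cooperativeManager", "roles")]
one_arc = [("expectations", "cooperativeManager")]
learned = [both_arcs] * 8 + [one_arc] * 2

kept, strength = average_structure(learned, threshold=0.85)
print(kept, strength)
```

With these toy inputs, the arc that appears in 8 of 10 networks has strength 0.8 and is dropped, while the arc present in all networks is retained.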
Parameters were again obtained with the bnlearn package in the R language by performing Bayesian parameter estimation using a Dirichlet distribution [46]. A conditional probability distribution is obtained for each node.
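As a rough illustration of Bayesian parameter estimation with a Dirichlet prior, the posterior mean of a state's probability is its count plus a pseudo-count, divided by the total. The sketch below assumes a symmetric prior with a hypothetical pseudo-count `alpha` per state; the paper does not report the prior settings, so these values are illustrative only.

```python
def dirichlet_cpt(counts, alpha=1.0):
    """Posterior-mean estimate of state probabilities under a symmetric Dirichlet prior."""
    total = sum(counts.values()) + alpha * len(counts)
    return {state: (n + alpha) / total for state, n in counts.items()}

# Invented counts for one node under one parent configuration (assumption).
est = dirichlet_cpt({"low": 3, "high": 7}, alpha=1.0)
print(est)
```

Compared with the raw frequencies (0.3 and 0.7), the estimates are pulled slightly toward the uniform distribution, which avoids zero probabilities for unobserved states.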

Naive Bayes
The NB model assumes that instances fall into one of a number of mutually exclusive classes, and it is the simplest BN classifier, where the predictive variables are assumed to be conditionally independent given the class:

$$P(c, x_1, \ldots, x_n) = P(c) \prod_{i=1}^{n} P(x_i \mid c) \qquad (4)$$

Even though the assumption of conditional independence is violated on numerous occasions in real applications, NB still performs well in many situations [41]. In Figure 2, three NB structures for leading positive feedback, cooperative manager and collaborative manager are obtained from the bnlearn package [43] in the R language [44] (see also Listing 2). The joint probability distributions for the NB models in Figure 2 are obtained as follows, first by simple probabilistic reasoning and then by applying the conditional independence assumption of an NB model.
Suppose $C$ = leadingPositiveFeedback, $X_1$ = expectations, $X_2$ = collaborativeManager, $X_3$ = cooperativeManager, $X_4$ = roles, $X_5$ = taskCohesion, then:

$$P(C, X_1, \ldots, X_5) = P(C) \prod_{i=1}^{5} P(X_i \mid C) \qquad (5)$$

If we suppose $C$ = cooperativeManager, $X_1$ = taskCohesion, $X_2$ = socialAttraction, $X_3$ = taskIntegration, $X_4$ = collaborativeManager, $X_5$ = leadingPositiveFeedback, $X_6$ = roles, $X_7$ = expectations, $X_8$ = taskAttraction, then we have:

$$P(C, X_1, \ldots, X_8) = P(C) \prod_{i=1}^{8} P(X_i \mid C) \qquad (6)$$

Finally, if we assume that $C$ = collaborativeManager, $X_1$ = taskAttraction, $X_2$ = taskIntegration, $X_3$ = cooperativeManager, $X_4$ = leadingPositiveFeedback, $X_5$ = expectations, we obtain:

$$P(C, X_1, \ldots, X_5) = P(C) \prod_{i=1}^{5} P(X_i \mid C) \qquad (7)$$

After the above steps, we have factorized the joint distribution as a product of conditional probability distributions. In an NB model, given the parent node, the remaining nodes are independent. Given the parent node, for each pair of nodes $X_i$ and $X_j$, by applying the symmetry property, the $X_i$ node is independent of $X_j$ and reciprocally.
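The NB factorization translates directly into a classifier: the posterior of the class is proportional to the prior times the product of the per-feature conditionals. The Python sketch below uses two of the feature names from the text with invented probabilities; it works in log space, a common implementation choice to avoid underflow, and is not the code used in the study.

```python
import math

def nb_posterior(prior, cond, evidence):
    """P(C | x) proportional to P(C) * prod_i P(x_i | C), computed in log space."""
    log_joint = {
        c: math.log(prior[c])
        + sum(math.log(cond[f][c][v]) for f, v in evidence.items())
        for c in prior
    }
    m = max(log_joint.values())  # subtract the maximum for numerical stability
    unnorm = {c: math.exp(lj - m) for c, lj in log_joint.items()}
    z = sum(unnorm.values())
    return {c: u / z for c, u in unnorm.items()}

# cond[feature][class_state][feature_state]; all numbers are made up (assumption).
prior = {"high": 0.5, "low": 0.5}
cond = {
    "expectations": {"high": {"high": 0.4, "low": 0.6}, "low": {"high": 0.7, "low": 0.3}},
    "roles": {"high": {"high": 0.8, "low": 0.2}, "low": {"high": 0.25, "low": 0.75}},
}

post = nb_posterior(prior, cond, {"expectations": "low", "roles": "high"})
print(post)
```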

Tree Augmented Naive Bayes
The TAN classifier [38] extends the NB model with a tree-shaped graph across the predictor variables. The TAN model is similar to NB, except that each predictor variable is allowed to depend on other predictor variables in addition to the class. Figure 3 shows three models that we have learned for leading positive feedback, cooperative manager and collaborative manager (see also Listing 3). This model provides more information than the NB model, as it includes information about the relationships among all predictor variables. Let us note that the joint probability distribution for the TAN models in Figure 3, obtained by simple probabilistic reasoning and then using the conditional independence assumption for a TAN model, is as follows (from left to right):

We can assume that $C$ = leadingPositiveFeedback, $X_1$ = expectations, $X_2$ = collaborativeManager, $X_3$ = cooperativeManager, $X_4$ = roles, $X_5$ = taskCohesion:

Now, let us suppose $C$ = cooperativeManager, $X_1$ = taskCohesion, $X_2$ = socialAttraction, $X_3$ = taskIntegration, $X_4$ = collaborativeManager, $X_5$ = leadingPositiveFeedback, $X_6$ = roles, $X_7$ = expectations, $X_8$ = taskAttraction, then:

As another step, we can assume that $C$ = collaborativeManager, $X_1$ = taskAttraction, $X_2$ = taskIntegration, $X_3$ = cooperativeManager, $X_4$ = leadingPositiveFeedback, $X_5$ = expectations; then, we will obtain:

Up to this point, we obtain the probabilistic models factorized according to the TAN models, which are special types of BNs; see Figure 3. The independence of the social attraction and leading positive feedback features given the roles and cooperative manager features is shown in the center of Figure 3; reciprocally by symmetry, leading positive feedback is independent of social attraction given the roles and cooperative manager features. Similarly, on the right side of Figure 3, cooperative manager and task integration appear as independent features given the collaborative manager and task cohesion features.
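For orientation, the generic form of a TAN factorization can be written as below. This is the standard general expression, not the specific factorizations of the three models in Figure 3, whose predictor-to-predictor arcs depend on the learned trees. The symbols $X_r$ (the root predictor of the tree) and $\pi(i)$ (the index of the single predictor parent of $X_i$) are notation introduced here for illustration.

$$P(C, X_1, \ldots, X_n) = P(C)\, P(X_r \mid C) \prod_{i \neq r} P\left(X_i \mid C, X_{\pi(i)}\right)$$

Each predictor other than the root has exactly two parents, the class and one other predictor, which is what "relationships of at most order one" refers to.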

Validation
The models are validated using 10-fold cross-validation. In Table 1, the area under the ROC curve (AUC) and the percentage correctly classified for the different features in the BN model are shown.
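A minimal sketch of how the sample could be partitioned for 10-fold cross-validation is given below. The splitting scheme (one shuffle with a fixed seed, then interleaved folds) is an assumption made for illustration; the paper does not describe how the folds were drawn.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the indices once and split them into k near-equal test folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

folds = k_fold_indices(403, k=10)
fold_sizes = sorted(len(f) for f in folds)
print(fold_sizes)
```

Each fold serves once as the test set while the model is fitted on the remaining nine; with 403 players, the folds hold 40 or 41 cases each.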

Performance Comparison
In order to provide reference benchmarks for how our BN classifies the features, we compare its performance with that of other classifiers. The performance of each classification model is evaluated using four statistical measures: accuracy, sensitivity, specificity and precision. Classification accuracy is defined as the ratio of the number of correctly-classified cases to the total number of cases. Sensitivity is the proportion of positive cases that are correctly classified as positive. Specificity is the proportion of negative cases that are correctly classified as negative, and precision is the positive predictive value, i.e., the proportion of cases predicted as positive that are truly positive. Other classifiers have also been considered for comparison with the Bayesian classifiers: the multilayer perceptron (MP) [48], logistic regression (LR) [49], Id3 [50] and random forest (RF) [51] algorithms. The best performance is obtained by the average BN and TAN models (see Tables 2-4). However, some of these models also show a low specificity for the collaborative manager and leading positive feedback models, which increases the Type I error (1 - specificity). Furthermore, BNs are able to produce probability estimates. In this sense, we are interested in knowing the features with the highest influence in maximizing collective efficacy, cooperative manager and collaborative manager in the high and low states.
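The four measures can be computed from the counts of a binary confusion matrix, as in the following Python sketch. The counts are invented to show how a classifier can reach a high accuracy while its specificity stays low when the negative class is rare.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity, specificity and precision from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }

# Invented counts (assumption): 85 positive and 15 negative cases.
m = classification_metrics(tp=80, fp=10, tn=5, fn=5)
type_i_error = 1 - m["specificity"]
print(m, type_i_error)
```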

Conditional Entropy
In Shannon's theory [47], the entropy of X is the lower bound on the average number of bits that are needed to encode values of X. Another way of viewing the entropy is as a measure of our uncertainty about the value of X, i.e., lower uncertainty about X will produce a low entropy value.
A natural question is: what is the cost of encoding X if we have already encoded Y? The conditional entropy of X given Y is:

$$H(X \mid Y) = -\sum_{x, y} P(x, y) \log_2 P(x \mid y) \qquad (11)$$

which captures the additional cost (in terms of bits) of encoding X when we have already encoded Y.
Note that the maximum value of probability in P(X|Y) implies the lowest entropy value.
For the leading positive feedback, cooperative manager and collaborative manager features, we are interested in determining and ordering the state values for conditioned features, because we can obtain the maximum probability value in the low state and the high state.This will lead to achieving the desired minimum conditioned entropy.
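The link between a maximal conditional probability and a minimal conditional entropy can be checked numerically. The Python sketch below computes H(X|Y) in bits from a small invented joint table: when Y = "a", X is fully determined and contributes no entropy, whereas when Y = "b", X is a fair coin.

```python
import math

def conditional_entropy(joint):
    """H(X|Y) in bits from a joint table {(x, y): P(x, y)}."""
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0.0) + p
    # Terms with P(x, y) = 0 contribute nothing and are skipped.
    return -sum(p * math.log2(p / p_y[y]) for (_, y), p in joint.items() if p > 0)

# Invented joint distribution (assumption).
joint = {
    ("high", "a"): 0.5, ("low", "a"): 0.0,
    ("high", "b"): 0.25, ("low", "b"): 0.25,
}
h = conditional_entropy(joint)
print(h)
```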

Results
The BN model has been selected to rank the features that can most increase the probability value of leading positive feedback, cooperative manager or collaborative manager in a specific state value.
Given the evidence $E = e$, our goal is to find the most likely assignment to the remaining variables, denoted by $U$; see Equation (12):

$$\mathrm{MAP}(U \mid e) = \arg\max_{u} P(u, e) \qquad (12)$$

BN Model for Leading Positive Feedback
At each step, we select the feature that most increases the likelihood of the leading positive feedback feature in a high state, i.e., at each step we choose the variable and the state that induce the greatest increase in the likelihood of the leading positive feedback variable in a high state. A summary is shown in Table 5 and Figure 4 (on the right side).
As shown in Table 5, we can use a step-by-step instantiation method to maximize the likelihood of the "leading positive feedback" feature. At Step 1, with no values set for other features in the initial BN, we reach a likelihood of 86.7%. At the subsequent steps, we add the conditions for the instantiation of other features one by one as indicated in Table 5; at each step, the likelihood of "leading positive feedback" increases accordingly and finally reaches 96.3%. Tables 6-10 apply the same step-by-step method to maximize the likelihood of other features.
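The step-by-step instantiation procedure can be sketched as a greedy search. The Python example below is an illustration under strong assumptions: it uses a three-variable toy joint distribution with invented probabilities and brute-force enumeration instead of the inference routines of a BN library, and the variable names `coop`, `expect` and `feedback` are shorthand chosen here.

```python
from itertools import product

STATES = ("low", "high")

def conditional(joint, names, target, state, evidence):
    """P(target = state | evidence) by summing over a full joint table."""
    num = den = 0.0
    for values, p in joint.items():
        row = dict(zip(names, values))
        if all(row[k] == v for k, v in evidence.items()):
            den += p
            if row[target] == state:
                num += p
    return num / den

def greedy_instantiation(joint, names, target, state):
    """At each step, fix the (variable, value) that most raises P(target = state)."""
    evidence = {}
    steps = [(None, None, conditional(joint, names, target, state, {}))]
    free = [n for n in names if n != target]
    while free:
        best = max(
            ((v, s, conditional(joint, names, target, state, {**evidence, v: s}))
             for v in free for s in STATES),
            key=lambda t: t[2],
        )
        if best[2] <= steps[-1][2]:
            break  # no remaining instantiation improves the probability
        evidence[best[0]] = best[1]
        free.remove(best[0])
        steps.append(best)
    return steps

# Toy joint (assumption): feedback depends on coop and expect.
names = ("coop", "expect", "feedback")
p_fb_high = {("low", "low"): 0.5, ("low", "high"): 0.3,
             ("high", "low"): 0.95, ("high", "high"): 0.85}
joint = {}
for c, e, f in product(STATES, repeat=3):
    pf = p_fb_high[(c, e)] if f == "high" else 1 - p_fb_high[(c, e)]
    joint[(c, e, f)] = (0.6 if c == "high" else 0.4) * 0.5 * pf

steps = greedy_instantiation(joint, names, "feedback", "high")
for variable, value, prob in steps:
    print(variable, value, round(prob, 3))
```

The first entry of `steps` is the probability with no evidence; each later entry records the variable instantiated, its value and the resulting probability, mirroring the layout of the step tables.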

Table 5. Step-by-step instantiations leading to maximization of the likelihood of the leading positive feedback variable in its high state.

Step | Instantiated Variable | Value | Leading Positive Feedback = High

Again, at each step we choose the variable and the state that induce the greatest increase in the likelihood of the leading positive feedback variable in a low state. A summary is shown in Table 6 and Figure 4 (on the left side).

Table row fragment: task cohesion = Low, 77.6%.

Figure 4. Step by step instantiations in the BN to maximize the leading positive feedback feature values in high and low states. The horizontal line represents the different steps from Tables 6 and 7. Vertical axis: probability for leading positive feedback. Legend: Cooperative Manager = High, Cooperative Manager = Low.

BN Model for the Cooperative Manager
At each step, we select the feature that most increases the likelihood of the cooperative manager feature in a high state, i.e., at each step we choose the variable and the state that induce the greatest increase in the likelihood of the cooperative manager variable in a high state. A summary is shown in Table 7 and Figure 5.

Table 7. Step-by-step instantiations leading to the maximization of the likelihood of the cooperative manager variable in its high state.

Step | Instantiated Variable | Value | Cooperative Manager = High

Figure 5. Step by step instantiations in the BN to maximize the cooperative manager feature in high and low states. The horizontal line represents the different steps from Tables 8 and 9. Legend: Roles = High, Roles = Low.

Again, at each step we choose the variable and the state that induce the greatest increase in the likelihood of the cooperative manager variable in a low state. In this case, the cooperative manager achieves the maximum in the low state when leading positive feedback and roles are instantiated in a low value; however, to see the influence of the other features, we show the roles feature in the last position. A summary is shown in Table 8 and Figure 5 (on the right side).

BN Model for the Collaborative Manager
At each step, we select the feature that most increases the likelihood of the collaborative manager feature in a high state, i.e., at each step we choose the variable and the state that induce the greatest increase in the likelihood of the collaborative manager variable in a high state. A summary is shown in Table 9 and Figure 6 (on the right side).

Table 9. Step-by-step instantiations leading to the maximization of the likelihood of the collaborative manager variable in its high state.

Step | Instantiated Variable | Value | Collaborative Manager = High

Again, at each step we choose the variable and the state that induce the greatest increase in the likelihood of the collaborative manager variable in a low state. In this case, the collaborative manager achieves the maximum in the low state when the features task attraction, leading positive feedback, task cohesion, experience, cooperative manager and task integration are all instantiated in a low state. A summary is shown in Table 10 and Figure 6 (on the left side).

Figure 6. Step by step instantiations in the BN to maximize the collaborative manager feature in high and low states. The horizontal line represents the different steps from Tables 8 and 9. Vertical axis: probability for the collaborative manager variable. Legend: Leading Positive Feedback = High, Leading Positive Feedback = Low.

Discussion
In this study, ranking the psychological team features is highly relevant, given the great difficulties that managers and coaches have in generating specific climates aimed at obtaining better performance from the team members acting as a whole team [53,55].
At present, there is still discussion about the extrapolation of the mechanisms of the teams and their tasks; however, there is a common agreement on the importance of the concepts of cohesion, leadership and the specific working position [9,10,54-56]. From this point of view, this set of concepts has been considered as a prototype for racing sports teams and should be discussed in terms of the relationship with the group performance. In previous works [52], the relationships between some features were established. In this study, we used the Markov blanket in the average BN to determine a different subset of features. Afterwards, two BN classifiers were determined. The aim of this study has been reached through the elaboration of the BN, the subsequent TANs and some major instantiations.
Our study shows the importance of the role played by social attraction (as was predicted by Carron and Eys [2,9,36]) and the team members' expectations, as was outlined when studying the role of self-efficacy related to the teams' performance [8]. These two variables appear as the BN antecessors, influencing the rest of the variables. Furthermore, in our study, the main node has been found to be the cooperative teamwork style (as has been outlined in previous studies [13,20,52]); it determines the BN bottom consequences, the members' roles and their experience, i.e., the length of time they have played on the same team. These two consequences are critical for the determination of the team performance and their stability across different situations. However, there may be a distortion of objectivity due to the team affective attraction, which can possibly affect the members' trust in the team's capabilities [53]. When we conducted instantiations on the BN and maximized the leading positive feedback managerial behavior in high and low states, the BN showed the cooperative manager as the feature with the strongest influence. A very interesting finding was that the initial expectations about the team performance need to be low when the leading positive feedback is maximized to high, and they need to be high when the leading positive feedback is maximized to low, in agreement with previous work [20]. After that, knowing under which features the cooperative manager would work with maximum probability in high and low states is quite important. When we instantiated the cooperative manager feature in the maximum of high and low states, the BN showed, not surprisingly, leading positive feedback as the feature with the strongest influence. Adding roles in high and low states, we maximize the cooperative manager in high and low states, respectively, achieving an entropy with a value of zero. The other variables that have shown influence on the cooperative managerial style are the social attraction, the specificity of the workplace and the roles when the style is maximized to both low and high states.
Maximizing the collaborative manager feature in high and low states is also of special interest, showing its contribution to the main cooperative node. It appears that when we maximize the collaborative manager in a high state, the task attraction is the variable with the strongest influence, followed by the cooperative manager, task integration (instantiated in the high state), leading positive feedback and, finally, expectations (when team members behave as being focused just on reaching the team's objectives). On the other hand, the task integration can be instantiated in a low state when we are interested in maximizing the collaborative manager to the low state.

Conclusions
In the study of sports science and other cooperative work systems, BNs provide some major advantages. They explicitly provide the conditional probability distributions of the values of every feature given the values of other input features. They are represented in a clear, appealing way, and are very easy to comprehend and translate to end users, i.e., managers and coaches. In this study, from the ranking obtained from the BN among the psychological variables, we conclude that the cooperative style with its associated behaviors constitutes the most prominent psychological variable. Managers have to work with team members' expectations and the social aspects of the job, because they are the antecessors of the whole BN. They also need to keep in mind that cooperation, developed mainly through a positive feedback work climate and a pro-social, i.e., collaborative, managerial style, leads to the correct decision on the proper assignment of team roles and the working duration of the team members.

Figure 1. Structure obtained by model averaging over 1000 networks. It was built with the tabu learning algorithm from the bnlearn package in the R language using a threshold = 0.85.

Figure 2. NB structures obtained using the bnlearn package in the R language for leading positive feedback, cooperative manager and collaborative manager.

Figure 3. Three tree augmented naive Bayes (TAN) structures obtained using the bnlearn package in the R language for leading positive feedback, cooperative manager and collaborative manager.

Figure 5. Step by step instantiations in the BN to maximize the cooperative manager feature in high and low states. The horizontal line represents the different steps from Tables 8 and 9.

Table 1. AUCs and the percentage correctly classified for the different features by the Bayesian network (BN).

Table 2. Comparison of the accuracy and other metrics for the leading positive feedback feature using different algorithms. MP, multilayer perceptron.

Table 3. Comparison of the accuracy and other metrics for the cooperative manager feature using different algorithms.

Table 4. Comparison of the accuracy and other metrics for the collaborative manager feature using different algorithms.

Table 6. Step-by-step instantiations leading to the maximization of the likelihood of the leading positive feedback variable in its low state.

Table 8. Step-by-step instantiations leading to the maximization of the likelihood of the cooperative manager variable in its low state.

Table 10. Step-by-step instantiations leading to the maximization of the likelihood of the collaborative manager variable in its low state.