Intensity of Bilateral Contacts in Social Network Analysis

Abstract: The approach presented here introduces the use of directed and weighted graph indicators in order to incorporate the intensity of bilateral contacts. The indicators are tested on a reference email network, and their applicability in explaining the role of each individual in the organization is explored. The results suggest that directional indicators have high explanatory relevance and can add value to conventional Social Network Analysis (SNA) approaches.


Introduction
With the growing role of social networks in everyday life (the Digital Society), combined with the abundance of data that can be collected and analyzed, the field of Social Network Analysis (SNA) is burgeoning. Typical SNA applications include the identification of the most important, critical, or influential person or link, the identification of main hubs or communities, the interactions between individuals in a network, or the visualization of the flows of information. SNA has been used in a large variety of fields, such as bibliometric analysis [1], regional development [2], emergency management [3], health policy [4], or airport competition [5].
SNA methods for large complex networks, however, still face several difficulties. Frequent weaknesses include the neglect of edge weights, the limited consideration of topology, and the computational complexity [6]. The validation of the results of SNA methods has often proven to be problematic [7], while the visualization and physical interpretation of the results can be challenging [8]. As a possible solution, Reference [9] proposed an approach that explores how complex networks are organized by higher-order connectivity patterns. The authors confirmed that motifs of connectivity at local scale can reveal the role of network elements and provide insights regarding the efficiency of the full network.
The approach presented here aims to extend the analysis of social networks by combining topology, traffic data, and behavioral aspects. The approach: (i) explores the application of recent advances in SNA methods that allow the consideration of weights and direction in the calculation of clustering coefficients; (ii) proposes an approach to extract indicators that describe the behavioral aspects of the social network members; and (iii) develops a model that can explain the actors' behavior through SNA indicators.
The main novelty of the approach proposed here is two-fold. As a first step, the introduction of weighted and directed indicators in standard SNA adds useful information on the direction of intensity of social network interactions. As a second step, the application of small world clustering coefficients, also using the direction and weights of the connections, can improve interpretability.

Related approaches are compared in terms of: (i) type of network analyzed; (ii) indicator type: Structural (conventional graph theory indicators), Activity (accounting for traffic, intensity or frequency of connections), or Clustering (local, "small world" indicators); (iii) use of direction in connections; and (iv) weights used (in the case of weighted indicators).

Study | Type of network | Indicator type | Direction | Weights
Adamic and Adar (2001) [12] | Web page network | Structural | Yes | Structural (number of links)
Watts and Strogatz (1998) [14] | Biological network; Collaboration network | Clustering | No | No
Barrat et al. (2004) [13] | Aviation network | Structural; Activity | Yes | Activity (available seats per year)
Fagiolo (2007) [15] | International

Shifting from the description of the structure of a social network to the identification of the social roles of the members of the network extends the analysis from one with a mainly topological focus to one encompassing behavioral aspects. Tang et al. (2012) [23] compared networks from phone calls, emails, Bluetooth scanning, and news sharing and developed a factor graph model to infer social relationships among the network members. Chen (2013) [24] found a correlation between user personality traits and social network activity. Saqr et al. (2018) [25] associated success of students in university examinations with their online interactions during each course.

Email Network Data and Main Indicators
The data used here were extracted from the Stanford Network Analysis Project (SNAP) "email-Eu-core-temporal" network, a well-known reference dataset for Social Network Analysis (SNA) of email traffic [26,27]. Email activity networks are a representative form of social networks. A number of individuals can be linked according to their interaction in terms of messages sent and received, while the number of interactions, their frequency, and the time differences may reveal information about the strength of bilateral relationships. Seen from a wider perspective, the different patterns of all bilateral relationships across the network can provide information on the role of each individual in the organization. Studying email networks, therefore, can be useful for the analysis of the operation of an organization.
The network was generated using real email traffic data from a large European research institution. Anonymized information about all incoming and outgoing email of the research institution was collected over 18 months. The information retained consists of the (anonymized) sender, the (anonymized) receiver, and the timestamp of the message dispatch. To convert the set of email messages into a network, each email address is considered a node. A directed edge between nodes i and j is created if i sent at least one message to j. SNAP also provides an additional dataset, "email-Eu-core-department-labels", which associates each individual email address with one of the 42 departments of the research organization. The resulting network consists of 986 nodes (unique email addresses). Since 21 email addresses had only outgoing messages within the institution, and 162 email addresses had only incoming messages from within the institution, there are only 824 transmitting nodes and 965 receiving nodes. Department size ranges from 1 to 109 members, with a mean of 23.93 members and a median of 14.5.
The email activity dataset consists of 332,334 observations, each corresponding to an email sent by an anonymized user id to another anonymized user id, with the corresponding timestamp. A graph representation of the email network can easily be constructed by assuming that each member of the network is a node of the graph. These observations can be considered as dynamic edges in the graph describing the network. The number of bilateral links regardless of the number and direction of the interactions, i.e., the static edges, is 24,929. The timestamps of each message allow the analysis of the dynamics over time.
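The construction described above can be illustrated with a minimal Python sketch. It assumes the messages are available as (sender, receiver, timestamp) tuples; the function name `build_email_graph`, the toy records, and the decision to ignore self-addressed emails are illustrative assumptions, not part of the original analysis.

```python
from collections import Counter


def build_email_graph(messages):
    """Aggregate (sender, receiver, timestamp) records into graph structures."""
    # weights[(i, j)] = number of emails sent from i to j (the dynamic edges)
    weights = Counter()
    for sender, receiver, _timestamp in messages:
        if sender != receiver:  # assumption: self-addressed emails are ignored
            weights[(sender, receiver)] += 1

    # A directed edge i -> j exists if i sent at least one message to j
    directed_edges = set(weights)
    # Static links ignore both the direction and the number of messages
    static_links = {frozenset(edge) for edge in directed_edges}
    senders = {i for i, _ in directed_edges}
    receivers = {j for _, j in directed_edges}
    return weights, directed_edges, static_links, senders, receivers


# Hypothetical toy records, not taken from the SNAP dataset
toy = [(1, 2, 10), (1, 2, 20), (2, 1, 30), (3, 1, 40)]
weights, directed_edges, static_links, senders, receivers = build_email_graph(toy)
```

In this toy example, node 3 only transmits and never receives, which mirrors the distinction between transmitting and receiving nodes made in the text.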
The number of emails sent by each individual is highly correlated with the number of emails received (Pearson correlation = 0.747). Figure 1 suggests that the linear relation between the two numbers and the absolute levels of the two values are independent of the department to which each individual belongs. Email activity appears to be an effect of the individual's role within the department and the organization at large, rather than an attribute associated with the role that each department has inside the organization. The correlation between the number of sent and received emails is even higher when summarized at department level (Pearson correlation = 0.967). The number of emails sent by a department's members to members of other departments is proportional to the number of emails received from other departments. In addition, even though there is significant variance in the number of emails sent or received by each individual, the aggregate figures at department level are, to a large extent, proportional to the number of individuals in each department (Figure 2). Even though there is significant variance among individuals in regard to email activity, the email flows between departments are symmetrical. Given that the email traffic network has a high number of nodes and connections, it is practically impossible to visualize its structure in a meaningful way. Even when summarizing at department level, mapping the connections between the 42 nodes in order to identify patterns in the flow of information is still a complicated task. Figure 3 summarizes the top 10% of directional bilateral email flows between departments. While a few departments appear frequently in these bilateral flows, the pattern of connections implies that no department is dominant in terms of intra-institutional email traffic. The basic statistics of email activity during the period comprise the traffic data for the network at individual (Table 2) and department (Table 3) level.
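As an illustration of how the correlation between sent and received volumes could be computed, the following sketch counts emails per individual and applies the standard Pearson formula. The function names and the tuple layout of the records are assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def sent_received_counts(messages):
    """Return two aligned lists: emails sent and emails received per node."""
    sent, received = Counter(), Counter()
    for sender, receiver, _timestamp in messages:
        sent[sender] += 1
        received[receiver] += 1
    nodes = sorted(set(sent) | set(received))
    return [sent[n] for n in nodes], [received[n] for n in nodes]


def pearson(x, y):
    """Pearson correlation of two equally long sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

The same `pearson` function could be applied to department-level totals after summing the individual counts by department label.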
There is high variance in regard to the number of emails each member of the network sent or received during the period, as well as in the ratio between the two. There are a few members that dominate the institution in terms of the number of emails sent. Twenty members sent more than 2500 emails during the period, perhaps due to an information dissemination role that they may have. But sending many emails does not necessarily result in receiving many (or vice versa). Several different profiles seem to be present in the dataset, again, probably due to the different roles in the institution and the communication patterns each individual prefers.

Graph Theory Indicators
The basic traffic data across the network provides an overview of the activity but is not sufficient to explain the role of each individual in the organization. Graph theory indicators are commonly used in social network analysis in order to describe the topology of a network and explain the relationships among its members. The three most frequently used indicators address centrality, a measure of the importance of each individual (corresponding to a node in the network) within the social network: degree centrality, closeness centrality, and betweenness centrality. All three centrality indicators were introduced by Freeman in 1979 [28] and form the basis of current SNA methods.
Degree centrality is the simplest expression of centrality and corresponds to the total number of existing connections between an individual node and the other nodes of the network. If $a_{ij} = 1$ when a connection between nodes i and j exists, and $a_{ij} = 0$ in the opposite case, the basic definition of degree centrality is:

$$C_D(i) = \sum_{j \neq i} a_{ij}$$

Normalizing the indicator adjusts for the network size by expressing a node's centrality as a share of its maximum possible level, when connections with all other nodes in the system are present. If N is the number of nodes in the network [29]:

$$C'_D(i) = \frac{\sum_{j \neq i} a_{ij}}{N - 1}$$

For non-directed networks, $a_{ij} = a_{ji}$ is assumed. In most social networks, however, and particularly in the email network used here, connections are asymmetric, and $a_{ij} \neq a_{ji}$ is quite frequent in the flow of information. In such a case, the two variants of degree centrality assuming a directed graph may differ:

$$C_D^{out}(i) = \frac{\sum_{j \neq i} a_{ij}}{N - 1}, \qquad C_D^{in}(i) = \frac{\sum_{j \neq i} a_{ji}}{N - 1}$$

In practice, the degree centrality in a network of email flows coincides with the share of unique senders of emails received by each individual (in-degree) and the share of unique recipients of emails sent by each individual (out-degree).
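The directional variants of normalized degree centrality can be sketched as follows. This is a minimal illustration that assumes the directed edges are given as a set of (i, j) pairs; the function name and data layout are hypothetical and do not reproduce the igraph calls used for the published results.

```python
def degree_centrality(edges, nodes):
    """Normalized in- and out-degree centrality for a directed graph.

    edges: iterable of directed (i, j) pairs, i.e. i sent at least one email to j
    nodes: iterable of all node ids (needed so isolated nodes are included)
    """
    nodes = list(nodes)
    n = len(nodes)
    recipients = {v: set() for v in nodes}  # unique recipients of v's emails
    senders = {v: set() for v in nodes}     # unique senders of emails to v
    for i, j in edges:
        if i != j:  # self-loops do not count as contacts
            recipients[i].add(j)
            senders[j].add(i)
    return {
        v: {
            "out": len(recipients[v]) / (n - 1),  # share of unique recipients
            "in": len(senders[v]) / (n - 1),      # share of unique senders
        }
        for v in nodes
    }
```

For example, with edges {(1, 2), (1, 3), (2, 1)} on four nodes, node 1 has an out-degree centrality of 2/3 and an in-degree centrality of 1/3.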
Closeness centrality is an indicator of the centrality of each node based on the distance between each individual node and all other nodes in the network. It is calculated as the reciprocal of the sum of the length of the shortest paths between node i and all other nodes in the network [30]:

$$C_C(i) = \frac{1}{\sum_{j \neq i} d_{ij}}$$

where $d_{ij}$ is the distance (number of edges) between nodes i and j. Normalization, in order to account for the network size, is performed by multiplying closeness by N-1, where N is the number of nodes in the network. The two directional forms of closeness after this transformation correspond to the inverse of the average distance from each node [31]:

$$C_C^{out}(i) = \frac{N - 1}{\sum_{j \neq i} d_{ij}}, \qquad C_C^{in}(i) = \frac{N - 1}{\sum_{j \neq i} d_{ji}}$$

Betweenness centrality measures the number of shortest paths between all other nodes of the network that pass through an individual node. If $\sigma_{jk}$ is the number of all shortest paths between two other nodes j and k of the network, and $\sigma_{jk}(i)$ is the number of those shortest paths that pass through i, betweenness centrality is calculated as:

$$C_B(i) = \sum_{j \neq i \neq k} \frac{\sigma_{jk}(i)}{\sigma_{jk}}$$

As in the case of the other centrality indicators, betweenness centrality can also account for different weights that express distance and differentiate between the directions of the connection. The general case, thus, can be transformed into:

$$C_B^{w}(i) = \sum_{j \neq i \neq k} \frac{\sigma^{w}_{j \to k}(i)}{\sigma^{w}_{j \to k}}$$

where the shortest paths follow the direction of the arcs and their length is the sum of the weights along the path. The calculations of these standard centrality indicators for the reference email network used here were done with the igraph software package [32]. The results for the main indicators are summarized in Table 4, with their standard and, where applicable, weighted and/or directional expressions. Degree centrality is a direct reflection of the number of unique individuals within the network that an email was exchanged with. The basic expression, normalized but non-directional, does not distinguish between sending or receiving an email. On average, each individual has been in contact with (in the sense that an email was sent to or received from) 6.8% of the individuals in the network during the period covered (67 out of a total of 985).
The reach of contacts ranges from a minimum of 0.3% (3 individuals) to close to 55% (540 individuals).
If the direction of the email flow is taken into account, the directional version of degree centrality allows more detail in the characterization of each node. The sum of the two directions of directed degree centrality equals the undirected degree centrality for each node. Nevertheless, the differences in the distribution of values reflect the asymmetry in the number of unique senders and recipients in the network and the varying patterns in email activity of individual members of the network. Degree centrality is positively skewed, with its distribution having a long tail towards values of high centrality. This is the result of a low number of nodes being highly central in terms of the number of individuals they exchanged emails with, either because they sent emails to a higher proportion of the network than the average (out-degree) or because they received proportionally more (in-degree). The skewness of out-degree centrality is significantly higher than that of in-degree due to the dominant role of a few members as senders of emails.
The closeness centrality indicators reflect a certain degree of symmetry in regard to their distribution statistics. The average member of the network is equally close to the center, regardless of whether incoming or outgoing email flow is considered. However, closeness for outgoing emails has a lower standard deviation and higher skewness than for incoming emails. This probably signifies that the relatively few members who send emails to a large part of the network act as an efficient channel of information flow across the network. The mean values for closeness centrality in Table 4 correspond to an average distance of 2.654 edges for outgoing emails and 2.652 edges for incoming emails, confirming the observation that the email network analyzed here is dense and highly connected.
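The relation between closeness and average distance can be illustrated with a breadth-first search on an unweighted directed graph. The sketch below is an assumption-laden illustration rather than the igraph routine used for the results: it averages only over the nodes that can actually be reached, which coincides with the normalized definition when every other node is reachable.

```python
from collections import deque


def closeness(adjacency, source):
    """Inverse of the average shortest-path distance from `source`.

    adjacency: dict mapping a node to an iterable of its successors.
    Assumption: only reachable nodes enter the average; returns 0.0 if
    no other node can be reached.
    """
    distance = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adjacency.get(u, ()):
            if v not in distance:
                distance[v] = distance[u] + 1
                queue.append(v)
    total = sum(distance.values())
    reached = len(distance) - 1
    return reached / total if total else 0.0


def reverse(adjacency):
    """Flip every arc, so that closeness() measures incoming paths."""
    flipped = {}
    for u, successors in adjacency.items():
        for v in successors:
            flipped.setdefault(v, []).append(u)
    return flipped
```

Calling `closeness(adjacency, i)` gives the out-closeness of i, while `closeness(reverse(adjacency), i)` gives its in-closeness. A node whose average distance to the others is 2 edges would obtain a value of 0.5.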
Betweenness centrality presents some small differences when the direction of the email flow is taken into account. While the two values are highly correlated at node level (Pearson correlation = 0.965), individuals with a large imbalance between the numbers of incoming and outgoing messages do have a marginal influence on the overall distribution.

Extending Clustering Approach
As described in Section 1, recent advances in SNA indicate the importance of identifying clusters of individuals and communities in social networks. A simplified approach may consider binary unidirectional connections between individuals. While such simplifications may not influence the overall analysis in certain types of social networks (e.g., club membership), they may distort the findings significantly in networks where the direction and intensity of information flow can be of major importance (e.g., Twitter, email or bibliographic network analysis). The local clustering coefficient was introduced in the seminal paper of Watts and Strogatz [14] as a measure to determine whether a graph is a small-world network by quantifying how close a node and its neighbors are to being a clique. The clustering coefficients are used here in order to explore the hypothesis that email networks present Small World characteristics. The underlying question is whether using such coefficients, and in particular their directional form, improves the understanding of the behavior of the members of the network.
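The clustering coefficient of node i is equal to the number of triangles connected to this node divided by the number of triples (i.e., potential triangles) centered on it [14]:

$$C_i = \frac{2 t_i}{k_i (k_i - 1)}$$

where $t_i$ is the number of triangles formed between node i and its possible neighbors, and $k_i$ is the degree of the node (the number of individual connections). Opsahl and Panzarasa (2009) [33] extended the definition of the clustering coefficient to weighted networks. Similar to the case of adding weights and direction to the standard centrality indicators in Section 4, assuming directed clustering in weighted networks can provide additional insight into the structure and dynamics of a social network. Nevertheless, a node can be part of triangles with arcs pointing in different directions. Four types of triangles can be distinguished [11,13]: cycle (the arcs form a directed loop through i), middleman (i lies on an indirect path between two neighbors that are also directly connected), in (both arcs of i point towards i), and out (both arcs of i point away from i). A directed clustering coefficient can be specified for each of the above cases, in order to account for the different patterns. Each coefficient is defined as the number of triangles of i with a specific pattern of arc directions, divided by the number of potential specific triangles of i. If $a_{ij}$ is a binary variable to indicate whether there is a connection between i and j or not, and $w_{ij}$ is the number of emails sent from i to j (and, consequently, $w_{ij} \neq 0$ if and only if $a_{ij} = 1$), the four clustering coefficients can be defined as:

$$C_i^{cyc} = \frac{\frac{1}{2} \sum_{j} \sum_{k} \left( w_{ij} a_{jk} a_{ki} + w_{ji} a_{kj} a_{ik} \right)}{\frac{1}{2} \left( s_i^{in} d_i^{out} + s_i^{out} d_i^{in} \right) - s_i^{\leftrightarrow}}$$

$$C_i^{mid} = \frac{\frac{1}{2} \sum_{j} \sum_{k} \left( w_{ij} a_{kj} a_{ki} + w_{ji} a_{jk} a_{ik} \right)}{\frac{1}{2} \left( s_i^{in} d_i^{out} + s_i^{out} d_i^{in} \right) - s_i^{\leftrightarrow}}$$

$$C_i^{in} = \frac{\frac{1}{2} \sum_{j} \sum_{k} w_{ji} \left( a_{jk} + a_{kj} \right) a_{ki}}{s_i^{in} \left( d_i^{in} - 1 \right)}$$

$$C_i^{out} = \frac{\frac{1}{2} \sum_{j} \sum_{k} w_{ij} \left( a_{jk} + a_{kj} \right) a_{ik}}{s_i^{out} \left( d_i^{out} - 1 \right)}$$

with $d_i^{out} = \sum_j a_{ij}$, $d_i^{in} = \sum_j a_{ji}$, $s_i^{out} = \sum_j w_{ij}$, and $s_i^{in} = \sum_j w_{ji}$, and where $s_i^{\leftrightarrow}$ is the strength of the connection between node i and its adjacent nodes j, expressed as:

$$s_i^{\leftrightarrow} = \sum_{j \neq i} a_{ij} a_{ji} \frac{w_{ij} + w_{ji}}{2}$$

The clustering coefficients were calculated with the DirectClustering package, are summarized in Table 5, and are visualized in Figure 5. An in-clustering coefficient with a value of 1 corresponds to the cases of individuals who were on the receiving side in all triangles formed by their email exchanges. This was the case for 14 individuals in the sample. The mean of the indicators ranges from 0.4413 to 0.4828. Variation appears to be higher for the In- and Out-coefficients than for the Cycle and Middleman coefficients.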
In terms of skewness, the Out- and Cycle coefficients have a higher value. The four coefficients present a high level of correlation, which is to a certain degree expected. The majority of the individual members of the network would participate in various forms of triangles with different directions and intensities of email flow. The number of emails sent to an individual is also highly correlated with the number of emails received from the same individual, and, consequently, the indicators weighted on the intensity of the bilateral connections will inherit the correlation. The high correlation levels can be useful in pattern analysis, especially in cases of networks where one would expect uniform distributions. The outliers can provide significant information about their role in the organization or identify a behavior that is not expected. Especially in regard to the relation between the In- and Out-clustering coefficients, the explanation of the part of their relation that is not explained by collinearity may be useful in identifying individual or organizational patterns. For example, a high in-clustering coefficient may signify the individual's role as a transmitter of information from the local cluster to the rest of the system. If the same individual has a low out-clustering coefficient, the flow of information in the opposite direction (from the rest of the system to the local cluster) is more limited. Depending on the case, this may be the result of the hierarchical structure of the local cluster (i.e., the individual is the manager of the team formed by several triangles) or a symptom of imbalances in the flow of communication. In practical terms, the graphical representation of the correlation between the four clustering coefficients (Figure 5) facilitates the identification of outliers.
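To make the four triangle patterns concrete, the sketch below counts them for a single node in an unweighted directed graph. It is a simplification under stated assumptions: weights are ignored (every arc counts as 1), the function name is hypothetical, a coefficient is set to 0.0 when no triangle of that type is possible, and the code does not reproduce the package used for the published results.

```python
def directed_clustering(edges, node):
    """Unweighted cycle, middleman, in and out clustering coefficients."""
    succ, pred = {}, {}
    for i, j in edges:
        if i != j:  # self-loops cannot be part of a triangle
            succ.setdefault(i, set()).add(j)
            pred.setdefault(j, set()).add(i)

    def arc(u, v):
        return v in succ.get(u, ())

    out_nb = succ.get(node, set())  # nodes that `node` sends to
    in_nb = pred.get(node, set())   # nodes that send to `node`
    d_out, d_in = len(out_nb), len(in_nb)
    d_bi = len(out_nb & in_nb)      # reciprocated contacts

    # cycle: node -> j -> k -> node
    t_cyc = sum(1 for j in out_nb for k in in_nb if j != k and arc(j, k))
    # middleman: k -> node -> j, with the direct arc k -> j also present
    t_mid = sum(1 for j in out_nb for k in in_nb if j != k and arc(k, j))
    # in: j -> node and k -> node, with j -> k closing the triangle
    t_in = sum(1 for j in in_nb for k in in_nb if j != k and arc(j, k))
    # out: node -> j and node -> k, with j -> k closing the triangle
    t_out = sum(1 for j in out_nb for k in out_nb if j != k and arc(j, k))

    def ratio(found, possible):
        return found / possible if possible else 0.0

    return {
        "cycle": ratio(t_cyc, d_in * d_out - d_bi),
        "middleman": ratio(t_mid, d_in * d_out - d_bi),
        "in": ratio(t_in, d_in * (d_in - 1)),
        "out": ratio(t_out, d_out * (d_out - 1)),
    }
```

For the three arcs 1 -> 2, 1 -> 3 and 2 -> 3, node 1 obtains an out coefficient of 0.5, node 3 an in coefficient of 0.5, and node 2 a middleman coefficient of 1.0. A weighted variant would replace the triangle counts by sums of the email volumes on the arcs adjacent to the node.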

Symmetrical and Asymmetrical Models
Having presented the differences between symmetrical and directed indicators, the question becomes whether using one or the other type of indicator influences the quality of the analysis of patterns in a social network. An experiment can be made by using a variable that expresses an operational characteristic of the network, independent from its topology, and estimating how the various centrality and clustering coefficients explain its variation. Given the data available in this dataset, a suitable variable that is independent from the individual network measures is the reaction time to emails. The dataset provides the timestamp for each email, information that was not used in the calculations of the indicators in the previous sections. While the reaction time between emails does not necessarily correspond to the time that has passed before a specific email was answered, it still provides a quantifiable indicator of the temporal dimension of the email interaction between two members of a social network. Shinkuma et al. [34] suggest that the frequency of interaction can be used as an indicator to characterize interpersonal communication in the network graphs.
In the experiment used here, a new indicator is constructed based on the email timestamps included in the dataset. If only the email exchanges that were bilateral during the period covered by the dataset (i.e., $a_{ij} = a_{ji} = 1$) are taken into account and on the premise that the timestamp difference in an email exchange between two individuals is a proxy of the response time, the exchange of emails between i and j would have the form of a series that can be ordered by time.
The series would have n elements, where n is the number of emails between i and j, regardless of direction. The number of emails from i to j would be equal to k, where k < n. Each email has a timestamp, with $t_n > t_{n-1}$. The response times of i to the emails sent by j can be calculated as the difference between the timestamp of each email $i \to j$ and the last unanswered email $j \to i$:

$$r_m^{i \to j} = t_m^{i \to j} - t_{last(m)}^{j \to i}$$

The timestamp of the last unanswered email would correspond to the timestamp of the latest email $j \to i$:

$$t_{last(m)}^{j \to i} = \max \left\{ t^{j \to i} : t^{j \to i} < t_m^{i \to j} \right\}$$

In a similar fashion, the response times by j to emails sent by i can be calculated as the difference between the timestamps of each sent email $i \to j$ and the next received email $j \to i$, if any:

$$r_m^{j \to i} = t_{next(m)}^{j \to i} - t_m^{i \to j}$$

The timestamp of the next email $j \to i$ would be that of the first email $j \to i$ received after each $i \to j$:

$$t_{next(m)}^{j \to i} = \min \left\{ t^{j \to i} : t^{j \to i} > t_m^{i \to j} \right\}$$

An example of the calculation of response times is given in Table 6. The average time of response by an individual i would be:

$$\bar{r}_i^{out} = \frac{1}{M_i^{out}} \sum_{j} \sum_{m} r_m^{i \to j}$$

while the average speed of responses received would be:

$$\bar{r}_i^{in} = \frac{1}{M_i^{in}} \sum_{j} \sum_{m} r_m^{j \to i}$$

where $M_i^{out}$ and $M_i^{in}$ are the numbers of response times entering each sum. The average speed of reply is, however, very sensitive to the period of analysis used and to the specific day or hour that a specific email was sent. A more suitable indicator of the relative importance of an individual in the network could be the share of outgoing emails that were answered within a certain time threshold.
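The response-time rule described above can be sketched in Python as follows. The sketch assumes (sender, receiver, timestamp) tuples and one particular reading of "last unanswered email": once i has replied, later emails from i do not generate a further response time until j writes again. The function names are hypothetical.

```python
def response_times(messages, i, j):
    """Response times of i to emails received from j.

    Each email i -> j is matched with the latest email j -> i that has not
    been answered yet; emails from i with nothing pending are skipped.
    """
    # Time-ordered exchange between the two individuals, both directions
    thread = sorted((ts, s) for s, r, ts in messages if {s, r} == {i, j})
    times = []
    pending = None  # timestamp of the latest unanswered email j -> i
    for ts, sender in thread:
        if sender == j:
            pending = ts
        elif pending is not None:
            times.append(ts - pending)
            pending = None  # the pending email now counts as answered
    return times


def mean_response_time(messages, i, contacts):
    """Average response time of i across a set of correspondents."""
    all_times = [t for j in contacts for t in response_times(messages, i, j)]
    return sum(all_times) / len(all_times) if all_times else None
```

With the hypothetical exchange [(2, 1, 100), (2, 1, 150), (1, 2, 210), (1, 2, 300), (2, 1, 400), (1, 2, 460)], individual 1 has two response times of 60 time units each, while individual 2 has a single response time of 100.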
The formulation that calculates the share of responses in the opposite direction within the threshold can be used as an indicator of an individual's own average speed of response.
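A possible implementation of the threshold-based indicator is sketched below. It assumes timestamps expressed in seconds, so that the 7 d and 24 h thresholds correspond to 604800 and 86400, and it counts only emails sent to correspondents who wrote back at least once, in line with the restriction to bilateral exchanges. The function name and these choices are assumptions made for illustration.

```python
from bisect import bisect_right


def share_answered_within(messages, i, threshold):
    """Share of i's outgoing emails followed by a reply within `threshold`."""
    sent, received = {}, {}
    for s, r, ts in messages:
        if s == r:
            continue  # ignore self-addressed emails
        if s == i:
            sent.setdefault(r, []).append(ts)
        elif r == i:
            received.setdefault(s, []).append(ts)

    total = answered = 0
    for j, out_ts in sent.items():
        if j not in received:
            continue  # not a bilateral exchange
        in_ts = sorted(received[j])
        for ts in out_ts:
            total += 1
            pos = bisect_right(in_ts, ts)  # first email j -> i strictly after ts
            if pos < len(in_ts) and in_ts[pos] - ts <= threshold:
                answered += 1
    return answered / total if total else 0.0
```

Swapping the roles of sender and receiver in the same logic would give the individual's own share of timely replies, which is the explanatory variable mentioned in the text.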
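Two different indicators are tested, with a 7 d and a 24 h threshold, respectively. For each indicator, three different models that explain the variation were developed: (i) a conventional model, based on the standard centrality indicators; (ii) a clustering model, based on the directed clustering coefficients; and (iii) an Extended Directional model, combining centrality and clustering indicators that account for direction and weights.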

Share of Outgoing Emails Responded Within 7 Days
The comparison of the three models that use a 7 d threshold is summarized in Table 7. The main indicators that are significant in all three models are the number of emails sent ($\sum_j w_{ij}$) and the individual's own speed in replying. This suggests that there is a high degree of reciprocity in an individual's email activity. The individuals in the network examined here who sent more emails, on average, received faster responses. At the same time, the individuals who reply fast also have their emails answered fast. Both variables suggest that the more active the role of the individual is in the system, the stronger the role is that the individual has in the network (at least as far as the dependent variable expresses such strength). Of course, it is possible that the causal relationship has the opposite direction, i.e., the faster an individual's emails are answered, the higher the number and the faster the responses of the individual.
The conventional model uses the two variables above and the individual's closeness centrality indicator. The relation is positive, meaning that the closer the individual is to the center of the network, the higher the share of the individual's emails that are answered within 7 d. Neither degree nor betweenness appears as a significant variable, suggesting that the speed of replies is not a function of the number of individual connections nor of the number of shortest paths that an individual node forms part of.
The clustering model, which uses the directed clustering coefficients, suggests that three directed coefficients can be useful in interpreting an individual's role in the network. There is a positive correlation with both the In- and Out-clustering coefficients and a negative correlation with the Middleman clustering coefficient. This indicates that individuals who participate in triads in which all three nodes communicate with each other tend to have a more active role in the overall network. Conversely, if the individual acts simply as a middleman, i.e., is part of the weaker communication channel in a triad, the individual's role in the network tends to be less active, at least measured in terms of the time for emails to be answered. The Extended Directional model combines centrality and clustering indicators accounting, in both cases, for direction and weights. This model maintains the main independent variables of the other two approaches, rebalancing their respective estimates and resulting in a visible improvement in accuracy. The $R^2$ coefficient of the Extended Directional model is 0.8596 compared to 0.8468 and 0.8428 for the other two models, respectively. Closeness is considered in its directional version, which results in its weight in the model being split in two. The two directions are not symmetrical though, with the in-closeness centrality having a negative correlation which roughly counter-balances the positive impact of the out-closeness one. The three clustering coefficients remain significant in the Extended Directional model, maintaining the direction of the influence, with small changes in the estimates. The estimates of the In- and Out-clustering coefficients converge to comparable levels, while the estimate for the Middleman coefficient decreases further.
The difference when direction and weights are taken into account to explain variation is noticeable, and the accuracy of the model increases. The department that each individual belongs to does not appear as significant. The three standard graph theory indicators, Degree, Closeness, and Betweenness, appear to be inter-related, even in their directed version, and, consequently, only Closeness appears as significant.

Share of Outgoing Emails Responded Within 24 Hours
A second set of models that use a threshold of 24 h for the delay in responses is summarized in Table 8. All three models have a lower accuracy than their counterparts that use 7 d as a threshold. This is probably a result of the time scale used, since counting the delay this way would include weekends and distort the speed in response. Even so, this threshold is useful for confirming the robustness of the approach. All independent variables for all three model configurations present the same direction in their impact on the dependent variable, with estimates having the same order of magnitude and showing a comparable level of statistical significance.
The reciprocity in the speed of responses remains. Individuals who respond within 24 h have a higher probability of their emails being answered within 24 h. High activity in terms of the number of emails sent is again an important indicator. The opposite directions of the estimates for the two directional closeness centrality indicators confirm the observation made in the 7 d model, i.e., that out-closeness is positively correlated with the speed of responses received, while in-closeness has the contrary effect. The In-, Out-, and Middleman clustering coefficients also corroborate the results of the 7 d model.
The results of the two models are robust enough to allow a generalization of the interpretation of each variable. The individual's own activity in terms of number, frequency, and destination of emails appears to reflect the relative importance of the individual in the network expressed as the share of timely responses to an individual's email within either a 7 d or 24 h period. In terms of activity, the frequency and response delay of the individual's own emails are a determinant of how others respond. The higher the number of emails sent and the faster emails are answered by an individual, the faster the responses of the individual's correspondents can be expected to be. From a topology perspective, taking into account the asymmetry of closeness centrality allows a more detailed evaluation of its impact. A central role as an emitter of information (high Out-closeness) increases the probability of quick reactions by others. In the opposite case, simply being close to the center as a receiver of information (high In-closeness) decreases this probability. Small world aspects can also explain part of the role. The Middleman clustering coefficient has a clearly negative correlation with the share of timely responses, something that suggests that there is a correspondence with the importance of an individual's role in the system. In-clustering and out-clustering both have a positive correlation, possibly indicating that the individuals with high coefficients are a link between local clusters and the rest of the system.

Validation of Methodology on Alternative Datasets
The methodology presented here appears to be applicable in the case of the reference email network, providing results that, to a certain extent, allow some insights on the structure and dynamics of the organization. A question, though, is whether this approach can be generalized to other types of networks, either of email or other social network activity. In order to test the robustness of the approach, the models developed using the email-Eu-core-temporal network (Sections 6.1 and 6.2) are applied to two different validation networks. The first validation network is "CollegeMsg", a dataset of private messages sent on an online social network at the University of California, Irvine, originally used in Reference [35]. Users could search the network for others and then initiate conversation based on profile information. The dataset consists of 59,835 messages among the 1899 members. It includes the anonymized identity of the sender and recipient, as well as the timestamp of the message. The timespan of the dataset is 193 days and permits the calculation of response times. Compared to the email-Eu-core-temporal dataset (330 thousand emails between 986 members, over 18 months), CollegeMsg is less dense and has a shorter timespan. In addition, the dynamics of participating in a message board differ from those of an email network in terms of speed, frequency, and intensity of interaction between members.
The second dataset used for validation is the Enron email network [36]. It consists of 52,587 nodes and 517,399 emails, in a timespan of more than four years. The dataset is not limited to Enron employees but covers all email exchanges with an Enron employee as a sender or recipient. On the other hand, there is no information available on the hierarchical structure inside Enron, since only email addresses are included. As a result, while it is a much larger dataset than both email-Eu-core-temporal and CollegeMsg, it is less dense and has a longer timespan than both. Table 9 summarizes the fit for the 7 d and 1 d responses applied to each of the three datasets. In all three cases, the Extended Directional model explains variation better than either the conventional or the clustering model. The fit for the smallest dataset (CollegeMsg) is better for both the timescales used and in all three model variations. In contrast, the large Enron dataset has a lower (though still acceptable) $R^2$ but also demonstrates a visible improvement when the Extended Directional model is applied. It is also notable that the 1 d timescale has a better fit than the 7 d timescale in the case of Enron. This comparison confirms that the explanatory power of the model increases in all cases when directional coefficients are used. The overall accuracy, though, depends on the specificities of each network. Different parameters, especially in regard to the timescale used, would probably improve the results of the comparison. Table 10 compares the estimates for each variable used in the Extended Directional model for each dataset. The main observations made in the analysis of the email-Eu-core-temporal network still hold true. The user's own activity in terms of number of emails and own time to respond explains a large part of variation. The estimates for in- and out-closeness normally have opposite signs, as do the ones for in- and out-clustering.
The results of the validation suggest that the main principles of the methodology presented here are still valid. Directional indicators, including the Small World indicators, do add value to the information and improve the predictive models in both alternative networks tested. The detailed results for the two validation networks can be found in Appendix A.

Conclusions
The work presented here addressed options to improve current practice in Social Network Analysis (SNA). The main novelty of the proposed approach is two-fold. The introduction of weighted and directed indicators in standard SNA adds useful information on the direction and intensity of social network interactions. In addition, the application of small world directional clustering coefficients improves interpretability. The experiments using a reference email network demonstrate that taking into account the direction and weights of the social network connections can improve the understanding of the network patterns and operational aspects.
The results also suggest that a hitherto perceived complication of social networks, the lack of symmetry, can in fact be seen as an advantage. Directed and weighted clustering coefficients may explain part of the variance in email activity, which is a promising candidate for explaining an individual's role within an organization. In addition, the significance of the local clustering coefficients in all experiments with the three different social networks confirms the hypothesis that email networks present certain Small World elements.
While the method and indicators were tested on only three networks, they are probably applicable to other types of social networks, regardless of scale and structure. Most social networks entail asymmetries in the flow of information and in the relations between their members. The results presented here suggest that there is a correlation between this asymmetry and the role of each member in the network, an observation that can probably be extended to other social networks.
The approach has several limitations that should be kept in mind when interpreting the results. The specific data used are a snapshot of a specific period in time. Most probably, the patterns of social network activity change over time, either due to the network's own dynamics or as a result of the introduction of alternative social networks. In the case of emails, the gradual introduction of competing communication tools, such as intranet or instant messaging, may modify the importance of this channel of communication. Another major caveat is the lack of objective evaluations of an individual's role against which a model that correlates it to SNA indicators can be developed and validated. This is, however, a general weakness of SNA-based analysis, since it is seldom possible to demonstrate a causal relationship between social network activity and a real-life assessment of an individual's importance.