Using an Exponential Random Graph Model to Recommend Academic Collaborators

Abstract: Academic collaboration networks can be formed by grouping different faculty members into a single group. Grouping these faculty members together is a complex process that involves searching multiple web pages in order to collect and analyze information, and establishing new connections among prospective collaborators. A recommender system (RS) for academic collaborations can help reduce the time and effort required to establish a new collaboration. Content-based recommender systems make recommendations based on similarity without taking social context into consideration. Hybrid recommender systems can be used to combine similarity and social context. In this paper, we propose a weighting method that can be used to combine two or more social context factors in a recommendation engine that leverages an exponential random graph model (ERGM) based on historical network data. We demonstrate our approach using real data from collaborations with faculty members at the College of Computer and Information Sciences (CCIS) in Saudi Arabia. Our results demonstrate that weighting social context factors helps increase recommendation accuracy for new users.


Introduction
Scientific collaboration is one of the defining features of modern science [1]. The quality of higher education has been linked to effective collaborations [2]. Additionally, collaborations can lead to high-impact research and development with many commercial applications. However, collaborations require researchers to build a social network consisting of people with similar scientific interests, and finding such people can require substantial time and effort. A recommendation system (RS) facilitates the process of identifying and finding academic collaborators, thereby increasing the number of collaborations.
Many collaborator RSs have been developed in recent years, but most are based on traditional approaches, such as the content-based approach, and employ fairly simple user models. These approaches ignore the fact that users interact with each other within a particular context and that the preferences of collaborators within one context may differ from those in another. A generic hypothesis of network science is that an actor's position in a network can determine the constraints and opportunities that he or she will encounter; therefore, identifying that position is critical for predicting outcomes and behavior [3]. Moreover, evidence from the literature [4,5] suggests that collaboration patterns and dynamics vary across scientific communities, fields, and individuals, which makes it important to consider the context of a collaboration before making recommendations. A context-independent collaborator RS could lose predictive power because potentially useful information from multiple contexts would be ignored.
Context-aware RSs generate more relevant recommendations by adapting recommendations to the specific contextual situation of a user. According to [6], "Context is any information that can be used to characterize the situation of an entity. An entity is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and applications themselves." Depending on the type of data used, four types of contexts are identified: the physical context represents physical attributes; the social context represents the presence and role of other people around the user; the interaction-media context describes the device used to access the system; and the modal context represents the user's current state of mind [7].
Hybrid recommender systems can be used to combine similarity and social context information. Hybrid approaches make recommendations by combining two or more methods to maximize the strengths of different approaches and overcome a given approach's limitations. Different hybrid methods have been suggested in the literature [8]. A popular approach for hybridization in recommender systems is the weighted method. In the weighted method, different methods are implemented separately and their predictions are combined.
Few studies have combined user similarity and social context, and even fewer studies have discussed methods to weight relevant contextual factors. This research proposes an approach for context-aware recommender systems (RSs) that combines research area similarity with social contextual information. This approach includes a method for weighting similarity and different social context factors. The approach is based on modeling historical collaboration using an extended version of a class of principled statistical models called exponential random graph models (ERGMs), and it involves several estimation, validation, and simulation experiments.
The remainder of the paper is organized as follows: Section 2 describes the background, Section 3 examines related work, Section 4 provides an overview of our approach, Section 5 demonstrates the implementation procedure and presents our results, and Section 6 discusses these results. Section 7 compares our approach with others, and, finally, Section 8 describes the research limitations and suggestions for future work.

Collaboration and Social Context
Collaborations can be viewed as social graphs in which nodes are members and an edge exists between two nodes if those members have collaborated. Using the social network perspective allows us to apply social network analysis.
Social network analysis (SNA) can be used to determine the social context of nodes (individuals) in the network. An important contextual property in social network analysis is centrality. High centrality scores identify nodes with the greatest structural importance in networks. Different centrality measures are used to measure different influence and power attributes of nodes in the network. Some of these well-known measures are as follows: degree centrality, which allows us to find nodes that exchange with numerous others and make their views noticeable; betweenness centrality, which allows us to find nodes critical to collaborations across communities and information flow in the network; and eigenvector centrality, which allows us to find nodes that are not necessarily important themselves, but that are connected to other important nodes. Table 1 displays a summary of some centrality measures and the formulas used to mathematically quantify these measures.

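Centrality Measure | Definition | Formula
Closeness centrality | Measure of the relative distance from node i to the n − 1 other nodes. | $C_C(i) = (n - 1) / \sum_{j} d(i, j)$, where $d(i, j)$ is the length of the shortest path between i and j.
Betweenness centrality | Measure of the extent to which a node lies between other nodes in the network. | $C_B(k) = \sum_{i, j} P_k(i, j) / P(i, j)$, where $P(i, j)$ is the number of shortest paths between i and j, and $P_k(i, j)$ is the number of shortest paths between i and j that k lies on.
Eigenvector centrality | Measure of node centrality that takes into account neighbors' centralities. | $x_i = \frac{1}{\lambda} \sum_{j} A_{ij} x_j$, where $A$ is the adjacency matrix and $\lambda$ is its largest eigenvalue.
Katz centrality | Measures node influence within a network. | $x_i = \alpha \sum_{j} A_{ij} x_j + \beta$, where $\alpha$ is an attenuation factor and $\beta$ is a constant.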
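The degree, closeness, and betweenness measures described above can be computed directly from an adjacency structure. The following is a minimal sketch in plain Python, assuming a small, connected, undirected toy network; the node names and the helper names (`shortest_path_counts`, `closeness_centrality`, and so on) are hypothetical and are not taken from the paper.

```python
from collections import deque

def shortest_path_counts(adj, source):
    """BFS from source: distance and number of shortest paths to each reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                sigma[w] += sigma[u]
    return dist, sigma

def degree_centrality(adj):
    """Number of direct collaborators of each node."""
    return {v: len(adj[v]) for v in adj}

def closeness_centrality(adj):
    """(n - 1) divided by the sum of shortest-path lengths from i (connected graph assumed)."""
    n = len(adj)
    result = {}
    for i in adj:
        dist, _ = shortest_path_counts(adj, i)
        total = sum(dist.values())
        result[i] = (n - 1) / total if total > 0 else 0.0
    return result

def betweenness_centrality(adj):
    """Sum over unordered pairs (i, j) of P_k(i, j) / P(i, j) for every other node k."""
    nodes = sorted(adj)
    info = {v: shortest_path_counts(adj, v) for v in nodes}
    result = {k: 0.0 for k in nodes}
    for pos, i in enumerate(nodes):
        dist_i, sigma_i = info[i]
        for j in nodes[pos + 1:]:
            if j not in dist_i:
                continue
            for k in nodes:
                if k in (i, j) or k not in dist_i:
                    continue
                dist_k, sigma_k = info[k]
                # k lies on a shortest i-j path only if the distances add up exactly.
                if j in dist_k and dist_i[k] + dist_k[j] == dist_i[j]:
                    result[k] += sigma_i[k] * sigma_k[j] / sigma_i[j]
    return result

# Hypothetical toy network: edges a-b, b-c, b-d, c-d.
toy = {"a": {"b"}, "b": {"a", "c", "d"}, "c": {"b", "d"}, "d": {"b", "c"}}
degree = degree_centrality(toy)
closeness = closeness_centrality(toy)
betweenness = betweenness_centrality(toy)
```

In this toy network, node b has the highest score on all three measures because every shortest path from a to the rest of the network passes through it.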
ERGM is a statistical model for examining relational data with complex dependencies. ERGM can help us determine how large a role different factors play in creating relationships between actors and forming a network. Three types of factors can be examined using ERGM: structural factors derived from network topological structure; fixed actor attributes, such as birthplace and gender; and variable actor attributes, such as affiliation, rank, influence, and power. More formally, ERGM assigns a probability value to a graph equal to the sum of network configurations weighted by parameters inside an exponential [9,10]. Each parameter corresponds to a network factor.
The general form of ERGM is given by the following equation:

$$P(X = x \mid \theta) = \frac{1}{c(\theta)} \exp\left(\theta^{T} s(x)\right)$$

where
• $P(X = x \mid \theta)$ is the probability of the entire graph x, conditional on the parameters represented by θ;
• c(θ) is a normalizing constant;
• θ^T is the (transposed) vector of parameters associated with the graph statistics; and
• s(x) is a vector of the graph statistics.
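To make the roles of θ, s(x), and c(θ) concrete, the sketch below enumerates every undirected graph on three nodes and computes its ERGM probability by brute force. It assumes, purely for illustration, that s(x) consists of the edge count and the triangle count, and the parameter values are hypothetical. Brute-force enumeration is only feasible for tiny graphs, which is why practical ERGM software relies on simulation instead.

```python
import math
from itertools import combinations, product

def graph_statistics(edges, nodes):
    """s(x) = (number of edges, number of triangles) for an undirected graph."""
    edge_set = {frozenset(e) for e in edges}
    triangles = sum(
        1
        for a, b, c in combinations(nodes, 3)
        if {frozenset((a, b)), frozenset((a, c)), frozenset((b, c))} <= edge_set
    )
    return (len(edge_set), triangles)

def ergm_distribution(nodes, theta):
    """Probability of every possible graph on `nodes` under exp(theta . s(x)) / c(theta)."""
    dyads = list(combinations(nodes, 2))
    weights = {}
    for present in product([0, 1], repeat=len(dyads)):
        edges = tuple(d for d, bit in zip(dyads, present) if bit)
        stats = graph_statistics(edges, nodes)
        weights[edges] = math.exp(sum(t * s for t, s in zip(theta, stats)))
    c = sum(weights.values())  # normalizing constant c(theta)
    return {graph: w / c for graph, w in weights.items()}

nodes = ["a", "b", "c"]
# Hypothetical parameters: a negative edge weight and a positive triangle weight.
dist = ergm_distribution(nodes, theta=(-1.0, 0.5))
uniform = ergm_distribution(nodes, theta=(0.0, 0.0))
empty_graph = ()
complete_graph = (("a", "b"), ("a", "c"), ("b", "c"))
```

With all parameters set to zero, every graph is equally likely; a negative edge parameter shifts probability toward sparser graphs.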

Recommending Collaborators
Considering the complex nature of academic collaboration, a variety of studies have addressed RSs from different angles. For example, Damiani et al. [11] investigated the impact of RSs on team processes in computer-supported collaboration environments and found that collaborator RSs increase user engagement.
In [12], the authors proposed different types of RSs that aim to enhance and increase collaboration among researchers in different scientific communities by pointing to other projects, researchers, and related topics. In [13], the authors suggested developing an RS that focuses on helping undergraduate students by recommending research opportunities. In [14], the authors discussed their challenges and experiences developing research article RSs for digital libraries and references.
In addition, a variety of methods have been proposed and used in the literature for collaborator RSs. Although collaborative filtering (CF) is the most commonly used approach for RSs, pure CF is difficult to implement in these systems, primarily because a CF system will work only if a group of users has rated some of the same items. There is no way a new item can be recommended to a user until another user has rated it. Alternatively, many content- and hybrid-based collaborator RSs have been proposed. Most of the reviewed literature on collaborator RSs adopts methods that fall into one of two categories: content-based filtering (CBF) approaches and hybrid-filtering approaches.
Content-based filtering (CBF) methods extract researchers' academic features using tags, user profiles, publications, and other criteria. For example, Lopes et al. [15] used researchers' publication areas and the vector space model (VSM) to make collaboration recommendations. Gollapalli et al. [16] suggested models for computing the similarity between researchers based on expertise extracted from their publications and academic home pages. Content-based filtering exhibits several desirable properties, such as scalability. Furthermore, this approach works well for predicting items that are new to the system, as the only process required is to calculate the similarity between the item and the user profile. There are, however, some drawbacks to this method. As a consequence of the recommendations being based exclusively on the user's profile, these recommendations may become overspecialized. In addition, such recommendations require constant updates to user profiles.
Furthermore, in CBF, extracting user profiles and gathering all the different aspects is a demanding task. Moreover, features used to describe users' interests are usually finite and predetermined. Another important limitation is that CBF is unable to capture the semantics of users' interests. Finally, two users are indistinguishable if they are represented by the same set of features.
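As an illustration of how a vector space model can express researcher similarity, the sketch below computes the cosine similarity between sparse term-weight profiles. This is a generic VSM example and not necessarily the exact method of [15] or [16]; the researcher profiles and the function name `cosine_similarity` are hypothetical.

```python
import math

def cosine_similarity(profile_a, profile_b):
    """Cosine of the angle between two sparse term-weight vectors stored as dicts."""
    shared = set(profile_a) & set(profile_b)
    dot = sum(profile_a[t] * profile_b[t] for t in shared)
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical profiles: research-area terms weighted by publication counts.
alice = {"machine learning": 2, "nlp": 1}
bob = {"machine learning": 1, "vision": 1}
carol = {"databases": 3}

sim_alice_bob = cosine_similarity(alice, bob)
sim_alice_carol = cosine_similarity(alice, carol)
```

The example also shows the last limitation noted above: two researchers with identical feature sets would receive a similarity of 1 and could not be told apart.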
Hybrid filtering approaches make recommendations by combining two or more different approaches to maximize the strengths of different approaches and overcome a given approach's limitations. Many approaches can be combined to meet specific application requirements. For example, social network analysis has emerged as a source of information that can be used to feed RSs with additional information in order to increase prediction accuracy. In addition, SNA can be used to gain insight into the social context of individuals in the network.
In the following section we focus on collaborator recommender systems that leverage the social context of users.

Recommending Collaborators Based on Social Context
This section focuses on hybrid RSs that combine SNA with other approaches. Many approaches based on social context have been proposed in the literature. For example, in [17] the authors proposed a hybrid algorithm combining expertise and social network information to recommend experts. In [18], the authors suggested a multi-theoretical and multi-level framework that combines social theory, SNA measures, and node attributes for the similar task of recommending topic experts. In [19], the authors combined semantic links and SNA on an academic social network to make recommendations based on the similarity between the target researcher and other researchers along a two-layer network using a spreading activation algorithm [20]. The goal of the spreading activation process is to identify the nodes that correspond strongly to a given activated node and measure the similarities of nodes.
In [21], the authors used community detection and a content-based approach to recommend knowledge experts in a semantic social network of experts. In [22], the authors combined keyword similarity with properties derived from the social network (such as distance). In [23], the authors proposed an approach for recommending influential co-authors by combining centrality and similarity. In [24], the authors combined two areas of similarity, namely the importance and activity measures of researchers, to make recommendations. In [25], the authors used a random walk algorithm to recommend collaborators. Random walks have proven to be a powerful mathematical tool for extracting information from the ensemble of paths between entities in a graph.
In [26], the authors proposed to enhance content-based RSs using academic social networks to suggest the most relevant items to members of these online societies. Their approach takes advantage of the interests and preferences of a user's friends and colleagues in providing more accurate recommendations.
Most collaborator recommender systems (RSs) based on social context linearly combine similarity and social factors based on heuristics [27,28]. A heuristic yields a workable solution without exploring all possible states of a problem. However, evidence from the literature suggests that collaboration dynamics can differ from discipline to discipline and even from location to location [28-31]. Only a few studies have examined the possibility of different weights in hybrid collaborator recommender systems [23,28]. In [28], the authors randomly experimented with different weights to find the optimal combinations for two social context factors. In [23], the authors gave users the responsibility of adjusting the weights for a single social context factor. Our approach, however, allows us to take many social context factors into consideration and systematically select weights without overwhelming users with the task of selecting weights.

RSs Based on the ERGM
Researchers in [18] pointed out advancements in social network analysis and the potential usefulness of social network modeling techniques such as ERGM in selecting relevant factors for recommending topic experts. Other researchers have proposed recommendation approaches using ERGM [32,33].
Our use of ERGM differs from that described in the previously mentioned studies because we apply ERGM to academic collaboration networks. In addition, we have used an extended form of ERGM that takes into account actors' attributes. Both approaches proposed in [32,33] focus on the network's topological structure and do not include actors' attributes.

Methodology Used
This paper proposes a method for a context-aware collaborator RS that consists of two phases (Figure 1). The first phase aims to weight different contextual factors using ERGM and historical collaboration data. The output from this phase is the estimated weights for the given factors; these weights are used in the second phase to make recommendations. The following section describes each phase in greater detail.
Information 2019, 10, x FOR PEER REVIEW

Phase One: Estimating Weights
In this phase, the weights of social contextual factors are estimated using ERGM. This process is based on the framework proposed by [9]. Estimation can be done using statistical software that includes an ERGM package, such as R. The process involves selecting parameters and then estimating and evaluating weights (Figure 2). Estimation is computationally intensive, involves multiple steps, and may require changing or updating parameters until the model converges. The final output from this phase is the estimated weight for each contextual factor. The four steps are:

• Historical collaboration data: Historical collaboration data play an important role in building and testing context-aware RSs. Historical collaboration data are data related to the collaborations of a group of researchers in a particular scientific community from a previous time period. These data include historical collaboration networks, research areas of individuals in the collaboration network, and centrality scores for these individuals. The observed collaboration network is a historical collaboration network.

• Selecting parameters: Contextual parameters that match the theories about collaboration factors must be selected. For example, because it is assumed that researchers choose to collaborate with similar and influential researchers, the following parameters are selected: research areas and social context parameters used to measure influence, such as degree centrality, betweenness centrality, and eigenvector centrality. These parameters represent different actor attributes. In addition, standard parameters corresponding to network topology can be included [34]. Each parameter corresponds to a network configuration, which in turn corresponds to a network theory.

• Estimating: Estimation can involve systematically searching through possible parameter values until the estimates converge. The outputs are the estimated weights for the chosen parameters. These values are validated through evaluation.

• Evaluating: The estimated parameters are evaluated using goodness of fit (GOF), which is a statistical approach for assessing how well estimated parameters fit the observed data using a t-ratio [33]. This method is included in the ERGM package and involves simulating networks using the estimated parameters and collecting summary statistics. The statistics of the simulated networks are compared with those of the actual network using a t-ratio.
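The evaluation step can be sketched as follows. The snippet assumes the common form of the GOF t-ratio, namely the observed statistic minus the mean of the simulated statistics, divided by their standard deviation; whether a given ERGM package uses the sample or the population standard deviation is an assumption here. The observed value of 212 is the edge count of the network reported in this paper, while the simulated values are invented for illustration.

```python
import statistics

def gof_t_ratio(observed, simulated):
    """t-ratio of an observed network statistic against simulated networks."""
    mean = statistics.mean(simulated)
    sd = statistics.stdev(simulated)  # sample standard deviation (assumed)
    if sd == 0:
        raise ValueError("simulated statistics have zero variance")
    return (observed - mean) / sd

# Edge count of the observed network, and edge counts from five
# hypothetical networks simulated with the estimated parameters.
observed_edges = 212
simulated_edges = [205, 211, 209, 214, 206]

t = gof_t_ratio(observed_edges, simulated_edges)
fits_well = abs(t) < 2.0  # threshold used for GOF in this paper
```

In practice many more simulated networks would be used, and the same ratio would be computed for every statistic of interest, not only for the edge count.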

Phase Two: Making Recommendations
Phase 2 involves making recommendations using the RS, which incorporates different contextual factors and their weights. A weighted hybridization method is used in which social contextual factors and research areas are combined linearly, each with a different weight (Figure 3). Making recommendations for each user involves selecting faculty members with similar research areas, calculating the social context for each member in the network, scoring potential collaborators, and making recommendations based on scores. To identify the social context of potential collaborators, three centrality measures are used: degree centrality, betweenness centrality, and eigenvector centrality.
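More specifically, the following equation is used for each user u to score all other members v in the network:

$$\mathrm{Score}(u, v) = \theta_{RA} \cdot \mathrm{ResearchArea}(u, v) + \sum_{x} \theta_{x} \cdot C_{x}(v)$$

where
• ResearchArea(u, v) is a variable that shows whether a given user u and collaborator v have similar research areas;
• θ_x is the weight for the given factor x (with θ_RA the weight for the research area); and
• C_x(v) is the value of context factor x (degree, betweenness, or eigenvector centrality) for potential collaborator v.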
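The weighted linear scoring described above can be sketched as follows. The data layout, the function names, and all weight and centrality values are hypothetical; in the proposed approach the weights would come from the ERGM estimates of Phase 1. The default of eight recommendations follows the evaluation setup used later in the paper.

```python
def score_candidate(same_area, context, weights):
    """Linear combination of the research-area match and the social context factors."""
    score = weights["research_area"] * (1.0 if same_area else 0.0)
    for factor, value in context.items():
        score += weights[factor] * value
    return score

def recommend(user, members, weights, k=8):
    """Return the k highest-scoring members other than `user`."""
    scored = []
    for name, profile in members.items():
        if name == user:
            continue
        same_area = profile["area"] == members[user]["area"]
        scored.append((score_candidate(same_area, profile["context"], weights), name))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]

# Hypothetical weights (stand-ins for ERGM estimates) and member profiles.
weights = {"research_area": 2.0, "degree": 0.5, "betweenness": 0.1, "eigenvector": 0.3}
members = {
    "u": {"area": "ai", "context": {"degree": 0.1, "betweenness": 0.0, "eigenvector": 0.1}},
    "v1": {"area": "ai", "context": {"degree": 0.2, "betweenness": 0.1, "eigenvector": 0.1}},
    "v2": {"area": "networks", "context": {"degree": 0.8, "betweenness": 0.5, "eigenvector": 0.6}},
    "v3": {"area": "ai", "context": {"degree": 0.6, "betweenness": 0.0, "eigenvector": 0.4}},
}

top_two = recommend("u", members, weights, k=2)
```

Because the research-area weight dominates in this toy setting, both candidates who share the user's research area outrank v2 despite its higher centrality scores.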

Implementation
Historical collaboration data that consist of publications and research areas for faculty members in 2013 were collected from the following data sources:

• Scopus: Scopus is one of the largest abstract and citation databases for peer-reviewed literature, including scientific journals, books, and conference proceedings. Publications indexed by Scopus from two years (2013 and 2014) were collected for faculty members associated with the College of Computer and Information Sciences (CCIS).

• College annual report: The college annual report details the main activities and achievements of students and faculty members each year. This information includes a list of the different types of publications for each faculty member indexed in Scopus and in other citation databases.

• Faculty websites: Every faculty member has a webpage hosted on the university server that includes information about their teaching and research activities.
A collaboration network was constructed using a collaboration matrix and consisted of CCIS members from five different departments: computer science, computer engineering, information systems, software engineering, and information technology. Each department was assigned a different color (Figure 4). The network consisted of 168 nodes and 212 links, in which each node represented a faculty member. A link between two members indicates that the members collaborated in writing a book, conference paper, or journal article in 2013.
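A minimal sketch of how such a network could be derived from publication records is shown below. It assumes each publication is available as a list of author names and that only co-authorships between listed faculty members produce links; the names and the function `build_collaboration_network` are hypothetical.

```python
from itertools import combinations

def build_collaboration_network(publications, members):
    """Adjacency sets: two members are linked if they co-authored at least one publication."""
    member_set = set(members)
    adj = {m: set() for m in members}
    for authors in publications:
        internal = sorted(set(authors) & member_set)  # ignore external co-authors
        for a, b in combinations(internal, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj

# Hypothetical faculty list and author lists; "X" is an external co-author.
faculty = ["A", "B", "C", "D"]
papers = [["A", "B", "X"], ["B", "C"], ["A", "B"]]

network = build_collaboration_network(papers, faculty)
link_count = sum(len(neighbors) for neighbors in network.values()) // 2
```

Repeated co-authorships collapse into a single link, and members without internal co-authors (D in this example) remain in the network as isolated nodes.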
Phase 1 began by loading all the nodes and their research areas into the network and calculating their different centrality scores (degree centrality, betweenness centrality, and eigenvector centrality). The weights of the research area and different contextual factors were then estimated. MPnet software (developed by the University of Melbourne) [34,35] was used to estimate, evaluate, and simulate the ERGM. More specifically, the following parameters were included:

• Research_Match, which demonstrates the significance that similar research areas have on collaboration;
• Degree_Activity, which illustrates the significance that degree centrality has on collaboration;
• Betweenness_Activity, which indicates the significance that betweenness centrality has on collaboration;
• Eigenvector_Activity, which identifies the significance that eigenvector centrality has on collaboration;
• Edges, which is a network topology parameter in ERGM; and
• Alternating Triangle (AT), which is a network topology parameter that represents transitivity. This parameter demonstrates the significance that a common collaborator has on collaboration.
Table 2 contains the estimation results and t-ratio values. An asterisk indicates that the parameter value is significant. Each parameter represents a different factor. The negative edge parameter indicates that the collaboration network is sparse, while the positive AT parameter indicates that there is a positive tendency toward transitivity. Transitivity means that if member a is connected to member b, and b is connected to member c, then a connection between a and c is more likely than it would be without the common collaborator b. Degree_Activity demonstrates that there is a positive tendency toward collaborating with members with high degree centrality. Finally, the results demonstrate that there is a positive tendency toward collaborating with members who share the same main research areas (Research_Match).
Table 3 displays the results of evaluating the model. The goodness of fit (GOF) indicates whether a specific model represents particular network structures well. The model was evaluated by using the estimated parameter values to simulate a distribution of graphs consistent with the model. The t-value is calculated by comparing the observed data with the statistics collected from the simulated graphs. If |t| < 2.0, then the model plausibly explains those features of the data. For the estimated model, the absolute GOF t-values for all parameters were less than 2.0.
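The sign-based reading of the parameters follows from a standard property of ERGMs, sketched here as a short derivation. The notation $x^{+}_{ij}$ and $x^{-}_{ij}$ is an assumption introduced for this sketch: it denotes the graph x with the tie between i and j present and absent, respectively, while all other ties are held fixed.

```latex
% Starting from P(X = x | theta) = exp(theta^T s(x)) / c(theta),
% the normalizing constant cancels in the conditional odds of a single tie.
\begin{align*}
\frac{P(X_{ij} = 1 \mid X_{-ij} = x_{-ij}, \theta)}{P(X_{ij} = 0 \mid X_{-ij} = x_{-ij}, \theta)}
  &= \frac{\exp\left(\theta^{T} s(x^{+}_{ij})\right) / c(\theta)}
          {\exp\left(\theta^{T} s(x^{-}_{ij})\right) / c(\theta)} \\
  &= \exp\left(\theta^{T} \left[ s(x^{+}_{ij}) - s(x^{-}_{ij}) \right]\right)
\end{align*}
```

Each parameter therefore acts as a change in the log-odds of a tie per unit change in its statistic. Adding any tie raises the edge count by one, so a negative edge parameter lowers the baseline odds of every tie, which corresponds to a sparse network. A tie that closes a triangle raises the transitivity statistic, so a positive AT parameter raises the odds of such ties.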

Evaluation
The RS is evaluated with data derived from Scopus regarding collaborations of CCIS faculty in 2014. Members with a degree of at least three were included (29 members: 14 old and 15 new). The dataset was divided into two groups: old users and new users. Finally, eight collaborators were recommended to each member in each scenario, and the results were compared with actual collaborators' data. Four types of relevant results were identified for each group:

• true positives (tp): These are the correctly predicted collaborators.

• true negatives (tn): These are the members who were correctly not recommended, that is, who were neither predicted nor actual collaborators.

• false positives (fp): These occur when a collaborator is predicted but the actual data show this prediction to be false.

• false negatives (fn): These occur when the RS fails to recommend an actual collaborator.
Three standard and common metrics for classification tasks in RSs are used: precision, recall, and F1 [36]. Both precision and recall are based on an understanding and measure of relevance:

$$\mathrm{precision} = \frac{tp}{tp + fp}, \qquad \mathrm{recall} = \frac{tp}{tp + fn}$$

Precision can be expressed as precision at k, where k is the length of the list of recommended items (e.g., P@1). There is usually a trade-off between precision and recall; as the list of recommendations grows and recall increases, precision tends to decrease. There is, however, a measure of accuracy, F1, that combines both precision and recall:

$$F_1 = 2 \cdot \frac{\mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$$
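The three metrics can be computed for a single user as in the following sketch. It assumes the recommendation list contains no duplicates and that the actual collaborators are known as a set; the names and data are hypothetical.

```python
def precision_recall_f1(recommended, actual, k):
    """Precision, recall, and F1 for the top-k recommendations of one user."""
    top_k = recommended[:k]
    tp = len(set(top_k) & set(actual))   # recommended and actually collaborated
    fp = len(top_k) - tp                 # recommended but did not collaborate
    fn = len(set(actual)) - tp           # collaborated but was not recommended
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical ranked recommendations and actual collaborators for one user.
recommended = ["a", "b", "c", "d"]
actual = {"b", "d", "e"}

p_at_4, r_at_4, f1_at_4 = precision_recall_f1(recommended, actual, k=4)
p_at_1, r_at_1, f1_at_1 = precision_recall_f1(recommended, actual, k=1)
```

Averaging these per-user values over all users in a group gives one point per value of k on precision and recall curves such as those discussed below.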

Old Users
Old users are faculty members with old collaboration data. Part of their data was used to build the RS model, while the other part was used in evaluation. Data from the year 2013 were used for modeling and data from 2014 for testing. Eight recommendations were generated for each user for each of the following scenarios: ERGM, equal, and random.
Figure 5 illustrates the precision for each scenario. The x-axis shows the number of recommended collaborators. Initially, the ERGM scenario (Scenario 1) performed worse than the other scenarios. However, after generating a few more recommendations, the ERGM approach achieved the highest accuracy. We presume this is because, for old users, only part of their historical data was used to construct the network and make recommendations.
Figure 6 demonstrates the recall for each scenario. The x-axis shows the number of recommended collaborators. The recall for all scenarios increases with each subsequent recommendation. The graph also indicates that after the first few recommendations, the recall of the ERGM approach began to increase more rapidly than that of the other scenarios.
The evaluation produced mixed results for old users; the ERGM approach enhanced recommendation accuracy, but only after a few faulty recommendations. The reason behind these mixed results is that potentially useful information, such as current collaborations, is not taken into consideration when generating recommendations for old users. Evidence from the literature suggests that existing collaborations affect future collaborations [37]. Additionally, best practices for scientific collaboration state that closing triangles (i.e., collaborating with one's collaborators' collaborators) is important [38].

New Users
New users include both users who have just joined CCIS and those whose past collaboration data are unavailable. In many RSs, these users suffer from a cold-start problem, which arises because no previous interactions have been recorded for them. Figure 8 displays precision for all three scenarios. The ERGM scenario generates the highest precision for new users, while the equal scenario results in higher precision than the random scenario.

Comparison with Other Methods
We compared the performance of ERGM with COCOON CORE [23]. COCOON makes recommendations by combining two collaboration factors (similarity and betweenness) and asking users to adjust the weights of both factors. We used equal weights for similarity and betweenness (50% each).
We conducted a set of experiments with 26 users, using their actual 2014 Scopus collaboration data. For each user, we recommended three collaborators using each of the two methods, ERGM and COCOON, and compared the results of both approaches with the actual collaboration data. Figure 11 shows ERGM outperforming COCOON. The precision of ERGM is 36.1%, compared with 12.5% for COCOON. The recall of ERGM is 24.3%, which is higher than the 8.4% recall of COCOON. Additionally, the F1 score of ERGM is 29.1%, which is higher than the 10.1% for COCOON.
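As a quick consistency check on the figures above, F1 can be recomputed as the harmonic mean of the reported precision and recall. The sketch below assumes the standard definition F1 = 2PR / (P + R) applied to the rounded aggregate rates; whether the reported F1 values were computed this way or averaged per user is not stated, so small differences are expected.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both in the same units)."""
    return 2 * precision * recall / (precision + recall)


# Rates as reported in the comparison, in percent.
ergm_f1 = f1_score(36.1, 24.3)
cocoon_f1 = f1_score(12.5, 8.4)

print(f"ERGM F1 from reported rates:   {ergm_f1:.2f}%")
print(f"COCOON F1 from reported rates: {cocoon_f1:.2f}%")
```

Both recomputed values fall within about 0.1 percentage points of the reported 29.1% and 10.1%, which is consistent with rounding of the inputs.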

Discussion and Research Limitations
Recommending collaborators involves different social and academic considerations. The complexity of the problem was addressed in this research by focusing on the primary issue of selecting weights for relevant contextual collaboration factors. A method to select and weight different social contextual factors was proposed using historical data and ERGM. The results indicated that using ERGM to weight contextual factors in hybrid RSs can increase recommendation accuracy, especially for new members. However, our work has several limitations. First, the main research area, as stated by faculty members, was used to represent research interest and similarity. This representation is limited and allows for an indication of only binary similarity (i.e., two members either do or do not have the same research area). Identifying the research similarity of members is a complex task, however, and has been the focus of many works that address the subject from different angles, such as topic modeling [39] and semantic analysis [40].
Second, the evaluation was set in the context of CCIS, but the approach can also be examined for other networks. However, building an ERGM for large networks may require the use of statistical sampling techniques, such as snowball sampling (Pattison et al. [10]), to reduce computational complexity. In addition, the proposed method was evaluated on CCIS members using real data, which restricted the dataset size. The approach should be tested on additional data. Moreover, other evaluation approaches, such as user surveys that assess perceived accuracy, can be used.
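To illustrate the kind of sampling mentioned above, the following is a minimal sketch of a snowball sample: starting from seed nodes, each wave adds all neighbors of the nodes reached in the previous wave. The adjacency data and function name are hypothetical, and this simple full-neighbor variant is an assumption; it does not reproduce the specific design of Pattison et al. [10].

```python
def snowball_sample(adjacency, seeds, waves):
    """Return the set of nodes reached within `waves` steps of the seeds.

    adjacency: dict mapping each node to an iterable of its neighbors.
    seeds: initial nodes of the sample.
    waves: number of expansion rounds to perform.
    """
    sampled = set(seeds)
    frontier = set(seeds)
    for _ in range(waves):
        # Collect neighbors of the current frontier not yet in the sample.
        next_frontier = {
            neighbor
            for node in frontier
            for neighbor in adjacency.get(node, ())
            if neighbor not in sampled
        }
        if not next_frontier:
            break
        sampled |= next_frontier
        frontier = next_frontier
    return sampled


# Hypothetical co-authorship network given as an adjacency mapping.
adjacency = {
    "a": {"b", "c"},
    "b": {"a", "d"},
    "c": {"a"},
    "d": {"b", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
sample = snowball_sample(adjacency, seeds=["a"], waves=2)
print(sorted(sample))
```

A model would then be fitted to the subnetwork induced by the sampled nodes rather than to the full network, trading some coverage for lower computational cost.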
In addition, the experiments indicated that the ERGM scenario outperforms the other scenarios for new users across all measures. The equal scenario performs better than the random scenario for new users; however, mixed results were obtained for old users, for whom the ERGM scenario outperforms the other scenarios only after making some inaccurate recommendations. This outcome suggests that this study's approach might be advantageously combined with other approaches for old users. The outcome also speaks to the usefulness of the approach for new users in cold-start situations [41].

Conclusions
In this paper, a method was proposed for a hybrid collaborator recommender system to weight different social context factors using historical data and ERGM. The results indicate that using ERGM to weight social context factors increases recommendation accuracy, especially for new members.
In future work, we plan to assess our method using additional datasets that include different attributes. Furthermore, we plan to extend our model to include varying degrees of research similarity.

Figure 4. Collaboration network within the College of Computer and Information Sciences (CCIS) in 2013.

• Group 1-Old: Old users are faculty members who collaborated in 2013 and 2014.
• Group 2-New: New users consist of new members who joined the CCIS network in 2014.
Three different scenarios for each group were generated to demonstrate the value of identifying relevant contextual factors and their weights:
• Scenario 1-ERGM: This scenario uses weights for contextual factors calculated using the ERGM.
• Scenario 2-Equal: This scenario considers equal weights for all contextual factors.
• Scenario 3-Random: This scenario uses random weights for contextual factors.
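The three scenarios differ only in how the contextual factors are weighted. The sketch below illustrates that idea with a weighted-sum score. Everything specific in it is an assumption made for illustration: the factor names, the coefficient values, the candidate data, and the choice to normalize positive ERGM coefficients so that they sum to one are hypothetical and are not taken from the fitted model in this paper.

```python
import random


def normalize(weights):
    """Scale positive weights so that they sum to 1."""
    total = sum(weights.values())
    return {name: value / total for name, value in weights.items()}


def score(candidate_factors, weights):
    """Weighted sum of a candidate's contextual factor values."""
    return sum(weights[name] * candidate_factors[name] for name in weights)


def recommend(candidates, weights, k):
    """Return the k highest-scoring candidate identifiers."""
    ranked = sorted(
        candidates, key=lambda c: score(candidates[c], weights), reverse=True
    )
    return ranked[:k]


# Hypothetical positive ERGM coefficients for three illustrative factors.
ergm_coefficients = {
    "same_research_area": 1.2,
    "shared_collaborator": 2.4,
    "same_department": 0.4,
}

rng = random.Random(0)  # fixed seed so the random scenario is repeatable
scenario_weights = {
    "ergm": normalize(ergm_coefficients),
    "equal": normalize({name: 1.0 for name in ergm_coefficients}),
    "random": normalize({name: rng.random() for name in ergm_coefficients}),
}

# Hypothetical binary factor values for three candidate collaborators.
candidates = {
    "u1": {"same_research_area": 1, "shared_collaborator": 0, "same_department": 1},
    "u2": {"same_research_area": 0, "shared_collaborator": 1, "same_department": 0},
    "u3": {"same_research_area": 1, "shared_collaborator": 1, "same_department": 0},
}

for name, weights in scenario_weights.items():
    print(name, recommend(candidates, weights, k=2))
```

Because the scoring function is shared, any difference in the resulting rankings comes from the weights alone, which is what the comparison between scenarios is meant to isolate.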

Figure 7 illustrates the F1 score for each scenario. The x-axis shows the number of recommended collaborators. The graph indicates that after the first few recommendations, the ERGM approach (Scenario 1) improves the F1 score.

Figure 8. Precision for new users.

Figure 9 illustrates recall for the three scenarios. The ERGM scenario increases recall for new users more than the other scenarios do.

Figure 9. Recall for new users.

Figure 10 illustrates the F1 score for the three scenarios. The ERGM scenario increases F1 for new users more than the other scenarios do, while the random scenario results in the worst performance.

Table 3. The goodness of fit (GOF) values.
Information 2019, 10