Article

New Developer Metrics for Open Source Software Development Challenges: An Empirical Study of Project Recommendation Systems

by Abdulkadir Şeker, Banu Diri and Halil Arslan
1 Department of Computer Engineering, Sivas Cumhuriyet University, 58140 Sivas, Turkey
2 Renewable Energy Research Center, Sivas Cumhuriyet University, 58140 Sivas, Turkey
3 Department of Computer Engineering, Yıldız Technical University, 34349 İstanbul, Turkey
* Author to whom correspondence should be addressed.
Appl. Sci. 2021, 11(3), 920; https://doi.org/10.3390/app11030920
Submission received: 28 November 2020 / Revised: 6 January 2021 / Accepted: 15 January 2021 / Published: 20 January 2021
(This article belongs to the Special Issue Knowledge Retrieval and Reuse Ⅱ)

Abstract

Software collaboration platforms, where millions of developers from diverse locations can contribute to common open source projects, have recently become popular. On these platforms, various information is obtained from developer activities that can then be used as developer metrics to solve a variety of challenges. In this study, we proposed new developer metrics extracted from the issue, commit, and pull request activities of developers on GitHub. We created developer metrics from individual activities and also combined certain activities according to common traits. To evaluate these metrics, we created an item-based project recommendation system. To validate this system, we calculated the similarity score using two methods and assessed top-n hit scores using two different approaches. The results for all scores with these methods indicated that the most successful metrics were binary_issue_related, issue_commented, binary_pr_related, and issue_opened. To verify our results, we compared our metrics with a metric generated from a very similar study and found that most of our metrics gave better scores than that metric. In conclusion, the issue feature is more crucial for GitHub than the other features. Moreover, commenting activity in projects can be as valuable as code contributions. Most of the binary metrics that were generated, regardless of the number of activities, also showed remarkable results. In this context, we presented improvable and noteworthy developer metrics that can be used for a wide range of open-source software development challenges, such as user characterization, project recommendation, and code review assignment.

1. Introduction

Thanks to the increasing capabilities of open source software (OSS) development tools, the number of open-source users and projects is growing each year. Existing software collaboration platforms include millions of developers with different characters and skill sets, as well as a wide variety of projects that offer solutions to various problems. GitHub, the largest of these platforms, hosts more than 100 million repositories to which over 40 million developers have contributed. On this platform, several features are used to manage distributed and open-source projects, of which the most widely used are issues, commits, and pull requests (PRs). Activities related to these features, such as opening an issue, merging a PR, or commenting on a commit, can provide information about developers and projects. The knowledge obtained from this information can then be used in the form of developer metrics to solve various software engineering challenges.
Some of the developer metrics used include the number of lines of code in a project, developers' degrees of connection to one another, past experience, and common features (e.g., nationality, location, occupation, gender, previously used programming languages). These metrics offer solutions to different problems within OSS development and distributed coding, including automatic assignments (task, issue, bug, or reviewer) [1,2,3], project recommendation systems [4,5], and software defect detection [6].
However, on platforms with such a large amount of data, it is challenging for developers to find similar projects to their own, identify projects of interest, and reach projects to which they can contribute. As developers primarily use search engines or in-platform search menus to find projects, the constraints of text-based search [7] and challenges related to finding the correct keywords also lead them to miss some projects [8]. While various project recommendation systems are being developed to overcome this problem, projects must be rated by users if recommendation models are to work properly. In the same way that viewers give ratings to movies that they have watched, developers need to rate the projects in which they are interested. However, this does not currently occur on software collaboration platforms, meaning there is no labeled dataset for work on this problem. For this reason, several developer metrics that can be extracted from the activity or features of both developers and projects can be used to calculate the score that a user gives projects.
In this study, new developer metrics are presented for use in a variety of open source distributed software development challenges. To evaluate these metrics, we developed a project recommendation system using data from GitHub, with the aim of making recommendations to developers based on their GitHub activities. Moreover, to address the sparsity problem on GitHub, we selected a dataset with a high project-user ratio. Despite this handicap, we obtained remarkable results in comparison with a similar study [5]. We present the following research questions for this study:
RQ1.
Can we offer new evaluation methods for the GitHub project recommendation problem?
RQ2.
Can we use the activities of GitHub users as a developer metric individually?
RQ3.
Can new metrics be obtained by combining similar properties or activities?
RQ4.
Is a user’s amount of activity important in the context of developer metrics?
In light of these research questions, we organized this paper as follows. In the background section, we discuss the literature on previously proposed developer metrics that have been used for a wide range of OSS development challenges. In the following section, we describe our dataset, proposed metrics, and project recommendation model. We evaluate the metrics using different approaches and methods in the last section.

2. Background

The activities of developers on platforms such as GitHub provide collaboration, learning, and reputation management in the community [9]. These activities form the metadata of the platforms, which is directly correlated with the reputation or performance of the developers [10]. In addition, developer metrics are created from these activities. In general, the metrics are related to the features of distributed code development processes, such as issues, commits, and PRs. We present some metrics that have been discussed in the literature in terms of these features and explain the challenges on which studies using these metrics have focused.
A PR allows users to inform others about changes they have pushed to a branch in a repository on GitHub. PRs are a key feature for contributing code by different developers to a single project [11]. The proposed metrics related to this feature are used to solve different PR-related problems. PRs need to be reviewed (by a reviewer) before they can be merged. If the result of a review is positive, the PR is integrated into the master branch. Finding the correct reviewer is thus an important parameter for ensuring rapid and fair PR revisions. Different metrics have been used in this context to address the problem of automatic PR reviewer assignment. The existing literature has proposed various metrics to solve this problem, including PR acceptance rate within a project, active developers on a project [3], PR file location [12], pull requesters' social attributes [13], and textual features of the PR [14], among others. Closing a PR with an issue, PR age, and mentioning (@) a user in the PR comments have all been used to determine the priority of a PR [15]. Cosentino proposed three developer metrics (community composition, acceptance rates, and becoming a collaborator) to investigate project openness and stated that project owners could evaluate the attractiveness of their projects using these metrics [16].
Developer metrics are also used for software defect detection. In one study, defects were estimated using different metrics grouped by file and commit level. The number of files belonging to a commit, the most modified file of all files in a commit, the time between the first and last commit, and the experience of a given developer on a committed file were identified as important metrics [6].
Reliability metrics are used to quantitatively express the software product reliability [17]. Metrics used to measure reliability in OSS include number of contributors, number of commits, number of lines in commits, and certain metrics derived from them. Tiwari proposed two important metrics for reliability, namely contributors and number of commits per 1 K code lines [18].
Code ownership is another important problem for OSS. One study used the number of modified (touched) files to rank developer ownership according to code contributions [19]. In another study, researchers used the number of changed lines in a file (churn) to address this problem [20]. Foucault confirmed the relationship between these code ownership metrics and software quality [21].
Recommender systems are an important research topic in software engineering [22,23]. In ordinary recommendation models, previously known user-item matrices are used. In other words, the rating given by a user for an item is known. In such cases, the essential research topic involves using different algorithms and models to estimate the rating that the user has already given [24]. However, this differs on software collaboration platforms like GitHub. Considering the developer as the user and the project (repository or repo) as the item, the rating that a developer gives to a project is unknown. In this context, the first problem that must be solved is how to create an accurate developer-project matrix. At this point, different developer metrics come into play.
In a study aiming to predict whether a user would join a project in the future, the metrics used included a developer’s GitHub age (i.e., when their account was opened), the number of projects that they had joined, the programming languages of their commits, how many times a project was starred, the number of developers that joined a project, and the number of commits made to a project [25].
In another study that explored the factors that led a user to join a project, the metrics used included a developer’s social connections, programming with a common language, and contributions to the same projects or files [26]. Liu et al. designed a neural network-based recommendation system that used metrics such as working at the same company, previous collaboration with the project owner, and different time-related features of a project [27].
Sun et al. relied on basic user activity to develop a project recommendation model using GitHub data. Specifically, when rating a project for a developer, they used "like-star-create" activities related to projects [5].
To sum up, some developer metrics that were noted as significant in their respective papers are given in Table 1. In this table, we also present the fields of the metrics (the related feature), their definitions, and the target challenge, along with the reference studies. These metrics have been used to address various challenges in OSS development, among which project characterization, reviewer assignment for issues or PRs, and project recommendation are prominent. Generally, researchers have aimed to use these metrics to characterize developers by analyzing their activities in order to solve a problem [28]. Thus, developer metrics become crucial factors for solving these challenges. In this study, we offer a number of developer metrics that can be used for a variety of challenges.

3. Research Design

3.1. Dataset

One of the most serious challenges in developing a recommender system is sparsity [33], a problem that occurs when most users rate only a few items [34]. This issue is also present on GitHub, as it is not possible for developers to be aware of most of the millions of repositories on the platform. Most of the studies mentioned in the previous section used limited (less sparse) and ad hoc (unpublished) datasets (Table 2). In Table 2, we present the number of users, the number of projects, and the ratio between them (user/project or project/user) in our dataset and the datasets of others. This ratio indicates the sparsity of the dataset. The greater a dataset's sparsity, the more difficult it is to make correct recommendations. Thus, the results of the related studies are questionable in terms of real platform data (because they were obtained on smaller datasets). Therefore, in this study, we used a public dataset called GitDataSCP (https://github.com/kadirseker00/GitDataSCP) that reflects the sparsity problem inherent in the nature of GitHub [35].
The dataset contained data related to 100 developers and 41,280 projects (repositories). The creators of the dataset indicated that they selected the most active users on the platform and extracted some related data from the activities of these GitHub users (Figure 1).
The number of records in each collection is also given in Figure 1 (below the collection names). The Repos collection and all other collections include records related to the Users collection. All details regarding the creation of the dataset are provided in its source study [35].

3.2. Recommendation Model

We created a recommendation model based on item-item collaborative filtering. The collaborative filtering method usually involves gathering and analyzing information about a user's behavior, activities, or preferences and predicting what they will like based on their similarity to other users. Collaborative filtering is based on the assumption that individuals who have had similar preferences in the past will make the same choices in the future. For example, if a user named John prefers products A, B, and C, and a user named Alice prefers products B, C, and D, it is likely that John will also prefer product D and Alice will prefer product A. Item-item logic can thus recommend similar items to a user who consumes any item.
Constructing a recommendation model for GitHub is different from classic (movie) principles. When recommending a movie to a user, the ratings of movies are known. However, this is not the case on GitHub, where developers do not rate projects. In this context, it is thus necessary to determine a metric that can be used as a rating. In addition, the model should recommend projects with which a developer is unfamiliar (just as a movie recommender should suggest movies that the user has not watched). We constructed a model taking into account these conditions.
A project recommender system for software collaboration platforms includes stages such as generating a project-developer rating matrix according to a certain metric, finding similarities between projects, generating recommendations, and evaluating the results. First, we generated a project-developer rating matrix (Table 3) using specific metrics. As seen in Table 3, columns represent the users and rows represent the projects (repos). We selected a metric and then input the metric's values (quantity, ratio, or binary) into the cells. For example, as shown in Table 3a, User-1 opened seven issues in Project-2.
We normalized the values of the project-developer matrix using min-max normalization, scaling the metric values from 0 to 10. As in the movie-user model, we assumed that each developer gave a rating (0-10) to each project (Table 3b). We then calculated similarity scores between projects using two methods (cosine and TF-IDF similarity) and recommended the top-n projects to each developer. Finally, we evaluated the correct recommendations using two evaluation methods (community relation and language experience).
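As a rough illustration of these first steps, the following sketch (our own, not the authors' code; the activity records are hypothetical and loosely based on Table 3a) builds a project-developer matrix for one metric and scales it to a 0-10 rating with min-max normalization:

```python
import pandas as pd

# Hypothetical activity records: one row per (user, project) pair with the count of a
# chosen metric, e.g. how many issues the user opened in that project (cf. Table 3a).
activity = pd.DataFrame({
    "user":         ["User-1", "User-2", "User-1", "User-100"],
    "project":      ["Project-2", "Project-1", "Project-1", "Project-1"],
    "issue_opened": [7, 5, 0, 196],
})

# Project-developer rating matrix: rows are projects, columns are users.
matrix = activity.pivot_table(index="project", columns="user",
                              values="issue_opened", fill_value=0)

# Min-max normalization to a 0-10 "rating", as with a viewer rating a movie (cf. Table 3b).
# We scale over the whole matrix here; the paper does not state whether scaling is global
# or per user, so this is an assumption.
min_v, max_v = matrix.values.min(), matrix.values.max()
ratings = (matrix - min_v) / (max_v - min_v) * 10
print(ratings.round(2))
```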

3.3. Similarity Methods

First, we found similar projects to recommend to developers. A memory-based collaborative filtering recommendation system selects a set of similar item neighbors for a given item [36]. Therefore, we used two memory-based approaches.

3.3.1. Cosine Similarity (Context Free)

For our first method, we chose an approach that does not use textual features. We used cosine similarity to calculate the similarity score between two projects (with explicit rating scores).
  • A project vector consists of the related row of the project-developer matrix (Equation (1)).
    P_{metricX}(i) = (u_1, u_2, u_3, \ldots, u_{100})    (1)
  • To calculate the similarity score between Project-i (P_i) and Project-j (P_j) according to the metricX rating, we used Equation (2).
    \mathrm{similarity}(P_i, P_j)_{context\_free} = \cos(P_{metricX}(i), P_{metricX}(j)) = \frac{P_{metricX}(i) \cdot P_{metricX}(j)}{\lVert P_{metricX}(i) \rVert \, \lVert P_{metricX}(j) \rVert}    (2)
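For completeness, a minimal sketch of this context-free similarity, assuming the normalized ratings matrix from the earlier sketch (projects as rows, users as columns):

```python
import numpy as np

def cos_sim(p_i: np.ndarray, p_j: np.ndarray) -> float:
    """Cosine similarity between two project rating vectors (Equation (2))."""
    denom = np.linalg.norm(p_i) * np.linalg.norm(p_j)
    return float(p_i @ p_j / denom) if denom else 0.0

# Example call, assuming `ratings` is the normalized project-developer DataFrame:
# sim = cos_sim(ratings.loc["Project-1"].to_numpy(), ratings.loc["Project-2"].to_numpy())
```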

3.3.2. TF-IDF Similarity (Context Based)

For our second method, we used text-based similarity. We used all of the comments in a project to (implicitly) calculate the similarity score between projects. In Figure 2, we show the most frequent words in the comments in the dataset to provide a rough understanding of the corpus.
  • We obtained all comments belonging to commits, issues, and PRs from all projects. Table 4 presents samples from issue comments.
  • We applied text preprocessing to these documents.
    • Convert all text into lower case.
    • Remove all digits.
    • Remove all punctuation.
    • Remove stopwords.
    • Remove extra whitespace.
  • We grouped all comments by project and merged them into a single field (using the comments column of Table 5). We thus aimed to generate one comment document for each project.
  • The documents we generated were too large to be processed by an ordinary PC (the average word count for each project's issue comments was approximately 11,000). Therefore, we decided to select the n most frequent words for each document. We applied Zipf's law to determine the cutoff point (n) [37]. As seen in Figure 3, we selected words with a rank greater than 100, to the right of the second knee (function words). Thus, we generated documents for all projects that each included a maximum of 100 words.
  • We calculated the TF-IDF similarity scores for all projects with the documents generated in the previous step, using Equations (3) and (4).
    \mathrm{tfidf}(t, d, Project_i) = \mathrm{tf}(t, Project_i) \cdot \mathrm{idf}(t, Project_i)    (3)
    \mathrm{similarity}(P_i, P_j)_{context\_based} = \mathrm{tfidf}(Project_i) \cdot \mathrm{tfidf}(Project_j)    (4)
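The pipeline described above could look roughly like the following sketch (our interpretation, not the authors' code): scikit-learn's TfidfVectorizer stands in for the TF-IDF computation, the stop word list and preprocessing details are assumptions, and the 100-word cutoff follows the description in the text.

```python
import re
from collections import Counter
from sklearn.feature_extraction.text import TfidfVectorizer, ENGLISH_STOP_WORDS
from sklearn.metrics.pairwise import cosine_similarity

def preprocess(text: str) -> list[str]:
    text = text.lower()                                  # lower case
    text = re.sub(r"\d+", " ", text)                     # remove digits
    text = re.sub(r"[^\w\s]", " ", text)                 # remove punctuation
    return [t for t in text.split() if t not in ENGLISH_STOP_WORDS]  # stopwords, whitespace

def top_words(tokens: list[str], n: int = 100) -> str:
    # Keep only the n most frequent words of a project's merged comment document.
    return " ".join(w for w, _ in Counter(tokens).most_common(n))

# Hypothetical merged comment documents, one per project (cf. Table 5).
project_comments = {"Project-1": "I'd like to see this commit included ...",
                    "Project-2": "Here is a simpler example ..."}
docs = [top_words(preprocess(c)) for c in project_comments.values()]

tfidf = TfidfVectorizer().fit_transform(docs)
similarity = cosine_similarity(tfidf)   # context-based similarity matrix (Equation (4))
```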

3.3.3. Handling Unknown Projects

After we calculated similarities using the above two methods, we generated ratings of unrated (unknown) projects for each developer. (We assumed that unknown projects were those to which a developer had no relationship and had made no contributions.) The similarity between unknown projects and rated projects was used to calculate the ratings of the unknown projects [5]. We calculated the rating of an unknown project as the dot product of the developer's ratings of known projects and the similarity values between those projects and the unknown project (Equation (5)). An example scenario involving this calculation is presented in Figure 4.
unknown_{rating} = \sum_{i=0}^{n} known_{rating_i} \cdot \mathrm{similarity}(known_i, unknown)    (5)
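A small sketch of Equation (5), under the assumption that we already have the developer's known ratings and their similarities to the unknown project:

```python
import numpy as np

def unknown_rating(known_ratings: np.ndarray, sims_to_unknown: np.ndarray) -> float:
    """Estimated rating of an unknown project: dot product of the developer's known
    project ratings and those projects' similarities to the unknown project (Equation (5))."""
    return float(known_ratings @ sims_to_unknown)

# Example: three rated projects with ratings 8, 3, 5 and similarities 0.9, 0.1, 0.4
# to the unknown project give an estimated rating of 8*0.9 + 3*0.1 + 5*0.4 = 9.5.
print(unknown_rating(np.array([8, 3, 5]), np.array([0.9, 0.1, 0.4])))
```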

3.4. Evaluation Methods

RQ1:
Can we offer new evaluation methods for the GitHub project recommendation problem?
We recommended the top-n highest-rated projects among the unknown projects to each developer. We then evaluated the recommendations using two methods we proposed. When recommending projects to developers, there should be a ground truth for evaluating the proposed projects. Unlike ordinary recommender systems, this is an unsupervised model. The evaluation criteria used in some studies related to this subject are set forth below.
  • A project-user rating matrix was split randomly into test and training subsets. Accuracy or recall scores were then calculated from the intersection between the top-n scores of the test and training subsets [5]. However, another study argued that this method should not be used on platforms like GitHub, where time is an important parameter, pointing out that predicting past activity with future data becomes a problem when k-fold cross-validation randomly divides the data [3].
  • In another study, the accuracy of project recommendations was evaluated using the developer’s past commits to the related project. A recommendation was assumed to be correct if the number of commits a certain developer made to the project exceeded a certain value. The average number of commits per project was set as the threshold value in the dataset [27].
  • In a study predicting whether a developer would join a project in the future, the dataset was split into two different sets by time. In this way, the predicted result was verified with actual future data [25].
  • In a survey-based study, the authors asked respondents which features could be used as a recommendation tool. Most of them stated that the languages in which developers already coded or with which they were familiar were important for recommendations [38].
In this study, we used two evaluation methods to analyze our proposed developer metrics.

3.4.1. Community Relation Approach

First, with the community relation approach, we used GitHub's watching and forking features as the ground truth. GitHub users can follow, or watch, projects whose development they want to monitor [39]. If a developer is watching a project, this indicates that he or she is interested in the project. Similarly, forking is used to contribute independently to a project of interest [40]. Developers usually make changes in their forked project (local branch) and can then send their contribution via PRs to the base project (master branch). External developers mostly use the fork-pull mechanism to contribute to projects of interest. In this context, we believe that both of these features are important for recommendations. While we used "watching or forking" as an evaluation criterion, we fine-tuned the criterion as detailed below.
The full name of a GitHub repository is created by concatenating the owner's name (login) with the repository name (e.g., davidteather/handtracking). In analyzing our results, we noticed that the model recommended some projects to developers that matched only the correct owner login or only the correct repository name. In other words, the model suggested a different project of the correct owner, or the exact opposite. We counted these suggestions as a half point (0.5), as recommending only the correct owner to a developer still allows the developer to discover other projects by that owner. Similarly, if the model recommends only a correct repository name with an incorrect owner, this indicates that it has suggested a forked version of a correct repository. Thus, the related developer can discover the master (base) repository.
An example scenario demonstrating this situation is given in Table 6. The projects recommended for Alice are listed in the first column of Table 6a. Two of them are among the repos that Alice watches; with these two correct matches, the initial score is 2. There are two repos by a developer named fengmk2 among Alice's watched projects (fengmk2/parameter and fengmk2/cnpmjs.org) (Table 6b). The model suggested the project fengmk2/emoji, which belongs to a developer with whom Alice is familiar. Similarly, Alice watches a forked project of visionmedia/co. Thus, two half scores are added to the initial score, and the final score is 3 (2 + 0.5 + 0.5).
We used Equation (6) to calculate the hit score (in other words, the score of correct recommendations). Our analysis showed that some developers had only a few watched projects. Thus, the case of a developer interested in (watching or forking) fewer than n projects is handled by the second branch of Equation (6).
hit\_score_{community} = \begin{cases} \dfrac{100 \, (hit_{fullname} + 0.5 \, hit_{partial})}{n}, & \text{if } num_{watch\_or\_fork} \geq n \\ \dfrac{100 \, (hit_{fullname} + 0.5 \, hit_{partial})}{num_{watch\_or\_fork}}, & \text{otherwise} \end{cases}    (6)
To sum up, in this evaluation approach, if the recommended project is among the developer’s watched or forked projects, the project is considered a hit.
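A rough sketch of Equation (6) follows (our own illustration; the repository names are taken from the Table 6 scenario):

```python
def community_hit_score(recommended: list[str], watched_or_forked: list[str], n: int) -> float:
    """Hit score of Equation (6): full credit for an exact owner/repo match and
    half credit when only the owner or only the repository name matches."""
    owners = {r.split("/")[0] for r in watched_or_forked}
    names  = {r.split("/")[1] for r in watched_or_forked}
    full = partial = 0
    for repo in recommended:
        owner, name = repo.split("/")
        if repo in watched_or_forked:
            full += 1
        elif owner in owners or name in names:
            partial += 1
    denom = n if len(watched_or_forked) >= n else len(watched_or_forked)
    return 100 * (full + 0.5 * partial) / denom

# Alice's scenario from Table 6: two exact hits and two half hits out of five.
recs    = ["visionmedia/co", "fengmk2/emoji", "iojs/io.js", "julgruber/co-read", "koajs/compose"]
watched = ["adamwiggins/co", "fengmk2/parameter", "iojs/io.js", "julgruber/co-read", "fengmk2/cnpmjs.org"]
print(community_hit_score(recs, watched, n=5))   # 100 * (2 + 0.5*2) / 5 = 60.0
```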

3.4.2. Language Experience Approach

With our second approach, we wanted to benefit from a developer's coding language experience. Thus, we processed the languages of the projects and used the knowledge obtained as an evaluation criterion. The projects in our dataset included 94 unique coding languages. The 20 most used languages are shown in Figure 5. Due to the diversity of languages in the dataset, we believe that the language feature can be used as another evaluation criterion.
We extracted all programming languages for projects that developers owned or watched or to which they made commits (Table 7). We aimed to discover the languages in which a developer had any activity. We then sorted them by frequency and identified the three most used languages (“expert languages” column in Table 7).
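A minimal sketch of how the three "expert languages" per developer could be derived (our own illustration; the language list is a truncated, hypothetical version of the user 21 row in Table 7):

```python
from collections import Counter

def expert_languages(all_languages: list[str], top: int = 3) -> list[str]:
    """Return a developer's most frequent project languages (cf. Table 7)."""
    return [lang for lang, _ in Counter(all_languages).most_common(top)]

# Truncated, hypothetical language list for a developer.
langs = ["c", "ruby", "ruby", "ruby", "javascript", "ruby", "javascript", "javascript", "go", "go"]
print(expert_languages(langs))   # ['ruby', 'javascript', 'go']
```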
Similarly, in this evaluation approach, if the recommended project’s language was among a developer’s top languages, the project was considered a hit. This evaluation’s scores were considerably higher than the first’s. However, our aim was to validate the significant metrics with another evaluation criterion.
hit\_score_{experience} = \dfrac{100 \, hit_{language}}{n}    (7)
In this way, we created a project recommendation model for software collaboration platforms. The algorithm of the recommendation model is presented in Figure 6, starting with selecting a feature as a metric and ending with calculating hit scores.

4. Empirical Results

We generated 40 different developer metrics that provide information about a developer's past activity on a project. All metrics were scaled from 0 to 10 using the min-max normalization technique. The project-developer relationship was thus rated in the range of 0 to 10 (as with a viewer's rating of a movie). We then applied all of these metrics to the project recommendation model and evaluated the results with the top 1, 3, 5, 10, and 20 recommendation hit scores.

4.1. Generating Developer Metrics

We created developer metrics using several methods. To extract metrics for a developer in a project, we used the number of activities, the ratio between certain activity counts, and whether an activity exists at all. First, using activities individually, we created metrics called single metrics. We then combined the single metrics according to common features to obtain the fusion metrics. Lastly, we created binary fusion metrics indicating whether a particular activity existed in a project.

4.1.1. Single Metrics

RQ2:
Can we use the activities of GitHub users as a developer metric individually?
Developer activity on projects was handled as a metric. Activity includes all kinds of comments, code contributions, revisions, and so on. In this section, all metrics were treated individually in order to evaluate the significance of each. These metrics refer to the number of activities per project for a given developer (Table 8).
In addition to these metrics, we calculated optimized metrics from the values of the single developer metrics. We named these metrics with the prefix 'O_'; for example, the name of the optimized pr_closed is O_pr_closed. We aimed to show the contribution ratio of each user for each project. For example, suppose John closed 10 PRs in projectA and 10 in projectB, while the total number of closed PRs is 100 in projectA and 1000 in projectB. John then contributed much more to projectA (10% contribution) than to projectB (1% contribution), as his closed-PR ratio in projectA is higher. We calculated this rating with Equation (8).
O\_activity_x = \dfrac{\#\_of\_activity_x^{\,user}}{\#\_of\_total\_activity_x^{\,project}}    (8)
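A sketch of this ratio (the function and variable names are ours, not from the paper):

```python
def optimized_metric(user_activity_count: int, project_total_activity: int) -> float:
    """O_activity: the user's share of a given activity within a project (Equation (8))."""
    return user_activity_count / project_total_activity if project_total_activity else 0.0

# John's closed PRs: 10 of 100 in projectA (0.10) versus 10 of 1000 in projectB (0.01).
print(optimized_metric(10, 100), optimized_metric(10, 1000))
```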
For comparison purposes, we added another metric proposed by Sun et al. They scored developers and projects using like, star, and create activities and used textual data extracted from projects' README and source code files to find project similarities [5]. Their dataset included approximately 22,000 repositories and 1700 developers. In our dataset, the ratio of the number of developers to the number of projects was approximately 1:400, whereas in Sun et al.'s study it was 1:14. We also planned to use their less sparse dataset to make a fair comparison, but could not because the dataset was not shared. As we were unable to communicate with the authors, we applied a very similar rating to our dataset.

4.1.2. Fusion Metrics

RQ3:
Can new metrics be obtained by combining similar properties or activities?
In our results, we observed that some metric groups came to the forefront, especially issue-related metrics. New metrics can be proposed by grouping comments, code contributions, or other common-feature metrics. Fusion metrics were created from combinations of single metrics, as listed below (a code sketch of these combinations follows the list).
  • Sun's metric: mentioned in the previous section; it is the metric from a similar study [5].
  • code_contributions: the sum of all code contribution-related metrics.
    code_contributions = pr_opened + issue_opened + issue_hasPR + pr_merged + commit_committed    (9)
  • comments: the sum of all comment-related metrics.
    comments = issue_commented + commit_commented + pr_commented    (10)
  • issue_related: the sum of all issue-related metrics.
    issue_related = issue_opened + issue_hasPR + issue_commented + issue_assigned    (11)
  • pr_related: the sum of all PR-related metrics.
    pr_related = pr_opened + pr_merged + pr_closed + pr_assigned    (12)
  • commit_related: the sum of all commit-related metrics.
    commit_related = commit_commented + commit_authored + commit_committed    (13)
  • commit2comment: commit_committed divided by commit_commented.
    commit2comment = commit_committed / commit_commented    (14)
  • issue2comment: issue_opened divided by issue_commented.
    issue2comment = issue_opened / issue_commented    (15)
  • pr2comment: pr_opened divided by pr_commented.
    pr2comment = pr_opened / pr_commented    (16)
  • code2comment: the ratio of two fusion metrics (code_contributions divided by comments).
    code2comment = code_contributions / comments    (17)
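The sketch below shows how these combinations might be computed, assuming the single metrics are columns of a pandas DataFrame with one row per (project, user) pair and column names following Table 8 (plus a pr_closed column). This is our own illustration, not the authors' code.

```python
import pandas as pd

def add_fusion_metrics(df: pd.DataFrame) -> pd.DataFrame:
    """Combine single developer metrics into fusion metrics (Equations (9)-(17))."""
    df = df.copy()
    df["code_contributions"] = df[["pr_opened", "issue_opened", "issue_hasPR",
                                   "pr_merged", "commit_committed"]].sum(axis=1)
    df["comments"]       = df[["issue_commented", "commit_commented", "pr_commented"]].sum(axis=1)
    df["issue_related"]  = df[["issue_opened", "issue_hasPR",
                               "issue_commented", "issue_assigned"]].sum(axis=1)
    df["pr_related"]     = df[["pr_opened", "pr_merged", "pr_closed", "pr_assigned"]].sum(axis=1)
    df["commit_related"] = df[["commit_commented", "commit_authored",
                               "commit_committed"]].sum(axis=1)
    # Ratio-based fusion metrics; a small epsilon guards against division by zero.
    eps = 1e-9
    df["commit2comment"] = df["commit_committed"] / (df["commit_commented"] + eps)
    df["issue2comment"]  = df["issue_opened"] / (df["issue_commented"] + eps)
    df["pr2comment"]     = df["pr_opened"] / (df["pr_commented"] + eps)
    df["code2comment"]   = df["code_contributions"] / (df["comments"] + eps)
    return df
```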

4.1.3. Binary Fusion Metrics

RQ4:
Is a user’s amount of activity important in the context of developer metrics?
The above metrics provide information about how many activities were performed. For instance, if John opened 18 issues in projectX, the John-projectX rating is 18. As an alternative, a set of metrics was created that simply shows whether a given activity exists. For instance, if John opened even a single issue in projectX, the John-projectX rating is 1; if John did not open an issue in projectY, the John-projectY rating is 0. We created the binary metrics from the single metrics using Equation (18).
BinaryMetric = \begin{cases} 1, & \text{if } SingleMetric > 0 \\ 0, & \text{otherwise} \end{cases}    (18)
Because the binary metrics consisted only of 0s and 1s, we did not use them directly. Instead, we created binary fusion metrics from the binary metrics, using the same equations as when creating the fusion metrics from the single metrics. We named these metrics with the prefix 'binary_'; for example, the binary fusion metric for comments is binary_comments.
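A minimal sketch of this binarize-then-combine step (again our own illustration; column names follow Table 8 plus pr_closed, and only three of the binary fusion metrics are shown):

```python
import pandas as pd

def add_binary_fusion_metrics(df: pd.DataFrame) -> pd.DataFrame:
    """Binarize the single metrics (Equation (18)) and recombine them as fusion metrics."""
    binary = (df > 0).astype(int)   # 1 if the activity exists in the project, 0 otherwise
    out = pd.DataFrame(index=df.index)
    out["binary_issue_related"] = binary[["issue_opened", "issue_hasPR",
                                          "issue_commented", "issue_assigned"]].sum(axis=1)
    out["binary_pr_related"]    = binary[["pr_opened", "pr_merged",
                                          "pr_closed", "pr_assigned"]].sum(axis=1)
    out["binary_comments"]      = binary[["issue_commented", "commit_commented",
                                          "pr_commented"]].sum(axis=1)
    return out
```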

4.2. Project Recommendation Results

In this section, we show only the top 1, 5, and 10 hit scores of the most significant metrics. Most of the ratio-based fusion metrics and optimized single metrics were very weak in all cases. Therefore, we removed them from the score tables below. The results of all metrics for all n values are provided in Appendix A.
We used two similarity methods to calculate the projects' similarity scores, and two approaches to evaluate the accuracy of the developer metrics. Thus, we give four sets of results for the recommendation system, one for each combination of similarity method and evaluation approach (Figure 7).
After we generated the single, fusion, and binary fusion metrics, we applied all developer metrics to the model. We present the hit score percentages obtained according to the two evaluation approaches (detailed in Section 3.4) in Table 9 and Table 10.
In these tables, the columns represent the top-n hit scores as percentages, and the green (1st), blue (2nd), and red (3rd) cells show the leading metrics in each top-n score column. In addition, we set the five most successful metrics according to the overall scores in bold. We used the mean reciprocal rank (MRR) evaluation method (Equation (19)), which is commonly used in question-answering systems (here, we used the number of models instead of the number of queries), to calculate this overall score, where n represents the number of models created (n = 6).
score_{MRR} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{rank_i}    (19)
For example, as shown in Table 9, the MRR value of comments is calculated as in Equation (20), using the metric's rank in each of the six models.
score_{MRR}^{\,comments} = \frac{1}{6} \left( \frac{1}{2} + \frac{1}{1} + \frac{1}{2} + \frac{1}{15} + \frac{1}{16} + \frac{1}{18} \right) = 0.36    (20)
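A one-function sketch of this overall score, fed with the ranks from Equation (20):

```python
def mean_reciprocal_rank(ranks: list[int]) -> float:
    """MRR of a metric over its rank in each of the n models (Equation (19))."""
    return sum(1 / r for r in ranks) / len(ranks)

# Ranks of the comments metric in the six models (Equation (20)).
print(round(mean_reciprocal_rank([2, 1, 2, 15, 16, 18]), 2))   # 0.36
```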
In addition to Table 9 and Table 10, we present all top-n hit score charts of the five most successful metrics (plus Sun_metric) for each method in Figure 8 and Figure 9.
Table 9 shows the hit scores obtained using the context-free and context-based similarity methods, as evaluated with the community relation approach. When the results were analyzed, the pr_opened metric was the most successful according to the MRR scores. As PRs indicate projects to which a developer contributed directly, we believe that modeling based on PR creation in a given project increases the success of the project recommendation system. In addition, the fusion metrics comments, binary_pr_related, binary_issue_related, and binary_comments also attracted our attention, as these metrics all relate to commenting activity. The results therefore indicate the importance of discussion on collaboration platforms.
Table 10 shows the hit scores obtained using the context-free and context-based similarity methods, as evaluated with the language experience approach. This approach elicited higher hit scores than the first approach, as we expected. (In the community relation approach, the model recommends the top-n projects out of approximately 40,000 projects; in contrast, the language experience approach only checks recommendations against 94 distinct project languages.) However, it is important to evaluate the success rates of the two approaches independently. The binary fusion metrics clearly stand out, and most fusion metrics gave better hit scores than single metrics. In this context, because a fusion metric represents many features of a developer, we believe that projects developed in the languages in which the developer specializes are better explored. In addition, in this approach, the binary_issue_related metric leads among the fusion metrics. The issue, which is the primary function of project management operations, is the most crucial feature in terms of developer-project correlation. Lastly, the results of Sun_metric drew attention here, unlike with the previous approach.
Analyzing Table 9 and Table 10 together, we saw that the binary metrics were quite successful. It is also noteworthy that comment-based metrics achieved higher scores than code activity-based metrics.

4.3. Threats to Validity

The scope of this study was limited to active developers on GitHub. The first challenge is whether these metrics will work well for inactive developers, a case resembling the cold start problem in classic recommender systems. It is even more difficult to make recommendations for inactive developers due to the comparative lack of information about them. Our proposed metrics must therefore be analyzed in other datasets that include such developers.
The project recommendation problem is important for collaboration platforms. GitHub has recently started to offer project recommendations on the “Explore” page based on certain user activities. Apart from the metrics we proposed, metrics applied to other challenges can successfully be used for the recommendation problem. We encourage researchers to work on this problem using different metrics.
Another problem involves studying private datasets for software engineering challenges. Making comparisons to studies that use different datasets can be challenging. In this sense, our results are limited to our own dataset (which is public). Finally, unlike classic recommender systems, there are no labeled data (ground truth) for our problem. For this reason, we consider it important to create a labeled dataset that can be used to work on the project recommendation systems for platforms such as GitHub.

5. Conclusions

We extracted different types of metrics using the number of activities, the ratio between some metrics, and only whether an activity exists. To evaluate our metrics, we developed a top-n project recommender system based on collaborative filtering with item-similarity logic, which finds items similar to those with which a user has already interacted (e.g., liking, disliking, or rating). In our study, an interaction with an item refers to a contribution to a project. We used two different methods to calculate the similarity between projects. The context-based similarity method had a positive impact on the hit scores. We then evaluated the accuracy of the metrics with two particular approaches.
In a movie recommender system, the ground truth is users’ actual ratings. However, there is no common evaluation baseline for GitHub project recommendation systems. Accordingly, in this paper, we proposed two approaches—community relation and language experience—as ground truth. The community relation approach checks whether a developer has watched (or forked) a given project, while the language experience approach uses as a baseline whether the language of the project is one in which the developer has previous experience.
First, we extracted developer metrics from individual activities. Among these single metrics, the most prominent were pr_opened, issue_hasPR, and issue_commented. In this context, we believe that these metrics are adequate even when used individually to obtain knowledge about developers. The crucial single metrics have common traits (such as the issue feature or commenting activity). Next, we created fusion metrics by combining single metrics. Of these fusion metrics, the comments metric produced significant results. Lastly, as we were curious about whether the amount of activity was important in the context of developer metrics, we created the binary fusion metrics based on whether an activity exists at all. Taking all results together, the peak scores were obtained from these metrics. In particular, binary_issue_related and binary_comments were the most attention-grabbing metrics.
As PRs indicate projects to which a developer has contributed directly, we believe that modeling based on PR creation in a given project increases the success of the project recommendation system. The issue, which is the primary function of project management operations, is the most crucial feature in developer-project correlation.
Our results indicate that quantity is not a crucial parameter for some metrics. For example, the issue- and PR-related metrics are quantity-free metrics. Effectively, this means that, when using these metrics, it is sufficient to know whether the feature in question is present. Even if a developer contributes to only one issue on a project, the relation between the developer and the project is tight. In this regard, it is revealing that the issue was a significant feature for collaboration platforms.
In conclusion, we have proposed remarkable and improvable developer metrics based on user activities in GitHub. In particular, we found that commenting on any feature was as important as code contributions. Issue-related activities were also highly important in developer metrics.
We took into consideration the challenge of the sparsity inherent in the nature of GitHub. Despite the sparsity problem, our hit scores were notable compared with a similar study, with most of our metrics more successful than their metric.
Finding similarities between documents is very difficult. In future research, we plan to use word embedding (e.g., word2vec, GloVe, etc.) methods instead of TF-IDF. We plan to apply the proposed metrics to different datasets for validation purposes. We are curious about why some of the new developer metrics we presented became prominent. In light of this study, we are planning another study involving a survey of junior and senior developers whom we can contact to understand the ground truth of our metrics’ success (especially metrics related to commenting activities). In addition, we plan to apply these metrics to solve various problems. For instance, many developers, in addition to owners and collaborators, can make contributions to projects thanks to the open-source nature of GitHub. On some projects, external developers even contribute more than the core team. These metrics can reveal developers’ contribution rankings on a particular project.

Author Contributions

Conceptualization, A.Ş.; Data curation, H.A.; Formal analysis, A.Ş. and B.D.; Funding acquisition, A.Ş.; Methodology, A.Ş., B.D. and H.A.; Project administration, B.D.; Resources, A.Ş. and H.A.; Supervision, H.A.; Validation, A.Ş. and H.A.; Visualization, A.Ş.; Writing—original draft, A.Ş. and B.D.; Writing—review & editing, B.D. and H.A. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. All scores of all developer metrics evaluated with the community approach (context-free).
Table A2. All scores of all developer metrics evaluated with the community approach (context-based).
Table A3. All scores of all developer metrics evaluated with the experience approach (context-free).
Table A4. All scores of all developer metrics evaluated with the experience approach (context-based).

References

  1. De Lima, M.L.; Soares, D.M.; Plastino, A.; Murta, L. Developers assignment for analyzing pull requests. In Proceedings of the ACM Symposium on Applied Computing; Association for Computing Machinery: New York, NY, USA, 2015; pp. 1567–1572. [Google Scholar] [CrossRef]
  2. Badashian, A.S.; Hindle, A.; Stroulia, E. Crowdsourced bug triaging. In Proceedings of the 2015 IEEE 31st International Conference on Software Maintenance and Evolution, ICSME 2015—Proceedings, Bremen, Germany, 27 September–3 October 2015; pp. 506–510. [Google Scholar] [CrossRef]
  3. Júnior, M.L.D.L.; Soares, D.M.; Plastino, A.; Murta, L. Automatic assignment of integrators to pull requests: The importance of selecting appropriate attributes. J. Syst. Softw. 2018, 144, 181–196. [Google Scholar] [CrossRef]
  4. Zhang, L.; Zou, Y.; Xie, B.; Zhu, Z. Recommending relevant projects via user behaviour: An exploratory study on Github. In Proceedings of the 1st International Workshop on Crowd-Based Software Development Methods and Technologies, CrowdSoft 2014—Proceedings, Hong Kong, China, 17 November 2014; Association for Computing Machinery, Inc.: New York, NY, USA, 2014; pp. 25–30. [Google Scholar] [CrossRef]
  5. Sun, X.; Xu, W.; Xia, X.; Chen, X.; Li, B. Personalized project recommendation on GitHub. Sci. China Inf. Sci. 2018, 61, 1–14. [Google Scholar] [CrossRef] [Green Version]
  6. Ozcan Kini, S.; Tosun, A. Periodic developer metrics in software defect prediction. In Proceedings of the 18th IEEE International Working Conference on Source Code Analysis and Manipulation, SCAM 2018, Madrid, Spain, 23–24 September 2018; pp. 72–81. [Google Scholar] [CrossRef]
  7. McMillan, C.; Grechanik, M.; Poshyvanyk, D. Detecting similar software applications. In Proceedings of the International Conference on Software Engineering, Zurich, Switzerland, 2–9 June 2012; pp. 364–374. [Google Scholar] [CrossRef] [Green Version]
  8. Hu, J.; Sun, X.; Lo, D.; Li, B. Modeling the evolution of development topics using Dynamic Topic Models. In Proceedings of the 2015 IEEE 22nd International Conference on Software Analysis, Evolution, and Reengineering, SANER 2015—Proceedings, Montreal, QC, Canada, 2–6 March 2015; pp. 3–12. [Google Scholar] [CrossRef] [Green Version]
  9. Dabbish, L.; Stuart, C.; Tsay, J.; Herbsleb, J. Social coding in GitHub: Transparency and collaboration in an open software repository. In Proceedings of the ACM Conference on Computer Supported Cooperative Work, CSCW, Bellevue, DC, USA, 11–15 February 2012; ACM Press: New York, NY, USA, 2012; pp. 1277–1286. [Google Scholar] [CrossRef]
  10. Alarcon, G.M.; Gibson, A.M.; Walter, C.; Gamble, R.F.; Ryan, T.J.; Jessup, S.A.; Boyd, B.E.; Capiola, A. Trust Perceptions of Metadata in Open-Source Software: The Role of Performance and Reputation. Systems 2020, 8, 28. [Google Scholar] [CrossRef]
  11. Gousios, G.; Pinzger, M.; Deursen, A.V. An exploratory study of the pull-based software development model. In Proceedings of the International Conference on Software Engineering, Shanghai, China, 18–21 August 2014; IEEE Computer Society: New York, NY, USA, 2014; pp. 345–355. [Google Scholar] [CrossRef] [Green Version]
  12. Thongtanunam, P.; Tantithamthavorn, C.; Kula, R.G.; Yoshida, N.; Iida, H.; Matsumoto, K.I. Who should review my code? A file location-based code-reviewer recommendation approach for Modern Code Review. In Proceedings of the 2015 IEEE 22nd International on Software Analysis, Evolution, and Reengineering, SANER 2015—Proceedings, Montreal, QC, Canada, 2–6 March 2015; pp. 141–150. [Google Scholar] [CrossRef]
  13. Tsay, J.; Dabbish, L.; Herbsleb, J. Influence of social and technical factors for evaluating contribution in GitHub. In Proceedings of the International Conference on Software Engineering, Shanghai, China, 18–21 August 2014; pp. 356–366. [Google Scholar] [CrossRef]
  14. Yu, Y.; Wang, H.; Yin, G.; Ling, C.X. Reviewer recommender of pull-requests in GitHub. In Proceedings of the 30th International Conference on Software Maintenance and Evolution, ICSME 2014, Riva del Garda, Italy, 23–30 September 2014; pp. 609–612. [Google Scholar] [CrossRef]
  15. Van Der Veen, E.; Gousios, G.; Zaidman, A. Automatically prioritizing pull requests. In Proceedings of the IEEE International Working Conference on Mining Software Repositories, Florence, Italy, 16–17 May 2015; pp. 357–361. [Google Scholar] [CrossRef] [Green Version]
  16. Cosentino, V.; Izquierdo, J.L.C.; Cabot, J. Three Metrics to Explore the Openness of GitHub projects. arXiv 2014, arXiv:1409.4253. [Google Scholar]
  17. Kaur, G.; Bahl, K. Software Reliability, Metrics, Reliability Improvement Using Agile Process. Int. J. Innov. Sci. Eng. Technol. 2014, 1, 143–147. [Google Scholar]
  18. Tiwari, V.; Pandey, R. Open source software and reliability metrics. Int. J. Adv. Res. Comput. Commun. Eng. 2012, 1, 808–815. [Google Scholar]
  19. Bird, C.; Nagappan, N.; Murphy, B.; Gall, H.; Devanbu, P. Don’t touch my code! Examining the effects of ownership on software quality. In Proceedings of the 19th ACM SIGSOFT Symposium on Foundations of Software Engineering; ACM Press: New York, NY, USA, 2011; pp. 4–14. [Google Scholar] [CrossRef]
  20. Munson, J.C.; Elbaum, S.G. Code churn: A measure for estimating the impact of code change. In Proceedings of the International Conference on Software Maintenance, Bethesda, MD, USA, 16–19 November 1998; pp. 24–31. [Google Scholar] [CrossRef]
  21. Foucault, M.; Teyton, C.; Lo, D.; Blanc, X.; Falleri, J.R. On the usefulness of ownership metrics in open-source software projects. In Information and Software Technology; Elsevier: Amsterdam, The Netherlands, 2015; Volume 64, pp. 102–112. [Google Scholar] [CrossRef] [Green Version]
  22. Happel, H.J.; Maalej, W. Potentials and challenges of recommendation systems for software development. In Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering, Atlanta, GA, USA, 9–14 November 2008; ACM Press: New York, NY, USA, 2008; pp. 11–15. [Google Scholar] [CrossRef] [Green Version]
  23. Robillard, M.; Walker, R.; Zimmermann, T. Recommendation systems for software engineering. IEEE Softw. 2010, 27, 80–86. [Google Scholar] [CrossRef]
  24. Sharma, L.; Gera, A. A Survey of Recommendation System: Research Challenges. Int. J. Eng. Trends Technol. 2013, 4, 1989–1992. [Google Scholar]
  25. Nielek, R.; Jarczyk, O.; Pawlak, K.; Bukowski, L.; Bartusiak, R.; Wierzbicki, A. Choose a Job You Love: Predicting Choices of GitHub Developers. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence, WI 2016, Omaha, NE, USA, 13–16 October 2016; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2017; pp. 200–207. [Google Scholar] [CrossRef]
  26. Casalnuovo, C.; Vasilescu, B.; Devanbu, P.; Filkov, V. Developer On boarding in GitHub: The role of prior social links and language experience. In Proceedings of the 2015 10th Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering, ESEC/FSE 2015—Proceedings, Bergamo, Italy, 30 August–4 September 2015; Association for Computing Machinery, Inc.: New York, NY, USA, 2015; pp. 817–828. [Google Scholar] [CrossRef]
  27. Liu, C.; Yang, D.; Zhang, X.; Ray, B.; Rahman, M.M. Recommending GitHub Projects for Developer Onboarding. IEEE Access 2018, 6, 52082–52094. [Google Scholar] [CrossRef]
  28. Onoue, S.; Hata, H.; Matsumoto, K.I. A study of the characteristics of developers’ activities in GitHub. In Proceedings of the Asia-Pacific Software Engineering Conference, APSEC, Bangkok, Thailand, 2–5 December 2013; IEEE Computer Society: Piscataway, NJ, USA, 2013; Volume 2, pp. 7–12. [Google Scholar] [CrossRef]
  29. Murgia, A.; Concas, G.; Tonelli, R.; Ortu, M.; Demeyer, S.; Marchesi, M. On the influence of maintenance activity types on the issue resolution time. In ACM International Conference Proceeding Series; Association for Computing Machinery: New York, NY, USA, 2014; pp. 12–21. [Google Scholar] [CrossRef] [Green Version]
  30. Jarczyk, O.; Jaroszewicz, S.; Wierzbicki, A.; Pawlak, K.; Jankowski-Lorek, M. Surgical teams on GitHub: Modeling performance of GitHub project development processes. Inf. Softw. Technol. 2018, 100, 32–46. [Google Scholar] [CrossRef]
  31. Soares, D.M.; de Lima Júnior, M.L.; Plastino, A.; Murta, L. What factors influence the reviewer assignment to pull requests? Inf. Softw. Technol. 2018, 98, 32–43. [Google Scholar] [CrossRef]
  32. Yu, Y.; Wang, H.; Yin, G.; Wang, T. Reviewer recommendation for pull-requests in GitHub: What can we learn from code review and bug assignment? Inf. Softw. Technol. 2016, 74, 204–218. [Google Scholar] [CrossRef]
  33. Niu, J.; Wang, L.; Liu, X.; Yu, S. FUIR: Fusing user and item information to deal with data sparsity by using side information in recommendation systems. J. Netw. Comput. Appl. 2016, 70, 41–50. [Google Scholar] [CrossRef]
  34. Guo, G. Resolving data sparsity and cold start in recommender systems. In Proceedings of the 20th international conference on User Modeling, Adaptation, and Personalization, Montreal, QC, Canada, 16–20 July 2012; Volume 7379 LNCS, pp. 361–364. [Google Scholar] [CrossRef]
  35. Şeker, A.; Diri, B.; Arslan, H.; Amasyali, F. Summarising big data: Public GitHub dataset for software engineering challenges. Cumhur. Sci. J. 2020, 41, 720–724. [Google Scholar] [CrossRef]
  36. Karabadji, N.E.I.; Beldjoudi, S.; Seridi, H.; Aridhi, S.; Dhifli, W. Improving memory-based user collaborative filtering with evolutionary multi-objective optimization. Expert Syst. Appl. 2018, 98, 153–165. [Google Scholar] [CrossRef]
  37. Piantadosi, S.T. Zipf’s word frequency law in natural language: A critical review and future directions. Psychon. Bull. Rev. 2014, 21, 1112–1130. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  38. Jiang, J.; Lo, D.; He, J.; Xia, X.; Kochhar, P.S.; Zhang, L. Why and how developers fork what from whom in GitHub. Empir. Softw. Eng. 2017, 22, 547–578. [Google Scholar] [CrossRef]
  39. Sheoran, J.; Blincoe, K.; Kalliamvakou, E.; Damian, D.; Ell, J. Understanding “watchers” on GitHub. In Proceedings of the 11th Working Conference on Mining Software Repositories, MSR 2014—Proceedings, Hyderabad, India, 31 May—1 June 2014; Association for Computing Machinery, Inc.: New York, NY, USA, 2014; pp. 336–339. [Google Scholar] [CrossRef]
  40. Kalliamvakou, E.; Gousios, G.; Blincoe, K.; Singer, L.; German, D.M.; Damian, D. An in-depth study of the promises and perils of mining GitHub. Empir. Softw. Eng. 2016, 21, 2035–2071. [Google Scholar] [CrossRef]
Figure 1. The dataset content.
Figure 2. The corpus of comment words at a glance.
Figure 3. The Zipf distributions of words in comments.
Figure 4. Calculating the rating of an unrated project with the help of similar rated projects.
Figure 5. The most commonly used languages in the dataset.
Figure 6. The flowchart of the project recommendation system.
Figure 7. The hit score calculations with four different parameter combinations.
Figure 8. All hit scores of the five most successful developer metrics (community).
Figure 9. All hit scores of the five most successful developer metrics (experience).
Table 1. Sample metrics used in the literature with GitHub data.
Field | Metric Name | Definition | Target Challenge | Ref
Issue | Maintenance Type | The type of the related issue (feature, bug, etc.) | Finding issue resolution time factors | [29]
Issue | A_D_issue_rep_assi | The total number of issues assigned to specific developers | Understanding issue closure rates | [30]
Issue | ContainsFix | Whether the pull request solves an issue | Pull request prioritization | [15]
Issue | Owned_issues | The number of issues opened by the project's owner | Prediction of joining a repository | [25]
Commit | D_languages | The committed programming languages of a developer | Prediction of joining a repository | [25]
Commit | Contibutors_SLOC | The number of contributors per 1K code lines | Exploring reliability metrics | [18]
Commit | TotalLoc | The number of lines in the last commit | Defect prediction with developer experience | [6]
Commit | ExpLoc | The code line experience of a developer on a committed file | Defect prediction with developer experience | [6]
Commit | ExpCom | The experience of a developer on a committed file | Defect prediction with developer experience | [6]
Commit | NumofFile | The number of modified files in the last commit | Defect prediction with developer experience | [6]
Commit | Past Experience | The number of files committed in the past in the same language | Finding factors of joining a project | [26]
Commit | Readme tf_idf | The tf-idf values of the ReadMe and code files of a project | Project recommendation | [5]
Pull Request | NumCommits | The number of commits in a PR | Automatic assignment of integrators to PRs | [3]
Pull Request | AcceptanceRate | The acceptance rate of PRs on a project | Automatic assignment of integrators to PRs | [3]
Pull Request | TotalLines | The number of changed lines in a PR | Automatic assignment of integrators to PRs | [3]
Pull Request | Age | Time between opening and closing a PR | Pull request prioritization | [15]
Pull Request | FileLocation | The directory of files in a PR | Automatic code-reviewer assignment | [12]
Pull Request | DecisionTime | PR decision time | Exploring the openness of a software project | [16]
Pull Request | Files_changes | The number of changed files (some vs. many) | Understanding the factors of assigning PR reviewers | [31]
Pull Request | Comment Network | The network of pull request comments | Discovering factors of the PR process | [32]
Table 2. The dataset sparsity of related studies.
Paper ID | Number of Users | Number of Projects | Ratio (~)
[5] | 1700 | 22,000 | 0.07
[26] | 1255 | 58,092 | 0.02
[27] | 1070 | 1600 | 0.66
[25] | 62,607 | 9447 | 0.15
[30] | 62,607 | 9447 | 0.15
Our study | 100 | 41,280 | 0.002
Table 3. A sample project-developer matrix for the metric issue_opened.
(a) Actual values of the metric
 | User-1 | User-2 | ⋯ | User-100
Project-1 | 0 | 5 | ⋯ | 196
Project-2 | 7 | 0 | ⋯ | 0
Project-3 | 0 | 0 | ⋯ | 4
Project-n | 0 | 34 | ⋯ | 0
(b) Normalized values of the metric
 | User-1 | User-2 | ⋯ | User-100
Project-1 | 0 | 4 | ⋯ | 10
Project-2 | 0.08 | 0 | ⋯ | 0
Project-3 | 0 | 0 | ⋯ | 0.5
Project-n | 0 | 3 | ⋯ | 0
Table 4. Issue comments at a glance.
User id | Issue id | Project id | Body
558591381 | I'd like to see this commit, or at least this feature, included. I'm trying to build something which uses the time information for a distributed team and I'd like to retain the timezone information ⋯
14566762927 | Here is a simpler example that I think shows the problem better.\r\n require 'rubinius/debugger' ⋯
Table 5. A sample project-developer matrix belonging to metricX.
 | Comments | Commit Comments | Issue Comments | Pull Request Comments
Project-1 | like included exception ⋯ | exception ⋯ | like included ⋯ | NaN
Project-2 | ⋯ | ⋯ | ⋯ | NaN
Project-n | NaN | ⋯ | ⋯ | ⋯
Table 6. A sample scenario for recommending with correct and half matches.
(a) Recommendations for Alice
Top-5 Recommendations | Correct Matches | Half Matches
visionmedia/co | | co
fengmk2/emoji | | fengmk2
iojs/io.js | iojs/io.js |
julgruber/co-read | julgruber/co-read |
koajs/compose | |
(b) Alice's watched repos
Score a | Repo Full Name
half | adamwiggins/co
half | fengmk2/parameter
correct | iojs/io.js
correct | julgruber/co-read
wrong | fengmk2/cnpmjs.org
a correct: exact full-name match; half: owner-only or repository-name-only match; wrong: not matched.
Table 7. The coding languages experienced in the past by each developer.
User id | All Languages | Expert Languages
21 | [c, ruby, ruby, ruby, go, ruby, ruby, ⋯] | [ruby, javascript, go]
13760 | [objective-c, objective-c, c, haskell, ⋯] | [objective-c, c, ruby]
3346407 | [css, python, python, lua, c++, ⋯] | [python, javascript, css]
Table 8. Single developer metric definitions.
 | Metric | Definition
1 | issue_opened | Number of issues opened
2 | issue_commented | Number of comments on issues
3 | issue_closed | Number of issues closed
4 | issue_hasPR | Number of issues that have a PR
5 | issue_assigned | Number of issues assigned
6 | commit_commented | Number of comments on commits
7 | commit_authored | Number of commits authored
8 | commit_committed | Number of commits
9 | pr_opened | Number of pull requests opened
10 | pr_merged | Number of PRs merged
11 | pr_assigned | Number of PRs assigned
12 | pr_commented | Number of comments on PRs
Table 9. Developer metric scores evaluated with the community relation approach.
Table 10. Developer metric scores evaluated with the language experience approach.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
