Fostering Productive Open Source Systems: Understanding the Impact of Collaborator Sentiment
Abstract
1. Introduction
- RQ1: Do significant differences exist in system outcomes, specifically productivity indicators (Number of PRs, Total LoC, LoC/PR), based on the internal system variable of OSS participant sentiment state (positive/neutral/negative)?
- RQ2: Which sentiment group has a more positive impact on the systemic property of community activation through active communication?
2. Theoretical Background and Literature Review
2.1. Theoretical Background
2.1.1. Systemic Understanding of the OSS Development Environment
2.1.2. The Collaboration Process in OSS
2.1.3. Productivity in Software
2.2. Related Work
2.2.1. Conversation and Attitude in PRs
2.2.2. Collaborator Attitude and Productivity
2.3. Research Problem Statement
3. Research Methods
- No. of PRs: Number of merged PRs created by an individual
- Total LoC: Total LoC written by an individual, counted from merged PRs only.
- LoC/PR: LoC per merged PR.
- Comments: Number of comments written by an individual in the PRs. To assess community activity level, comments from all PRs (open, closed, and merged) were collected.
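The three productivity indicators defined above can be sketched as a small aggregation over collected PR records. This is a minimal illustration, not the authors' pipeline: the function name `productivity_metrics` is hypothetical, the field names (`author`, `mergedAt`, `additions`, `deletions`) follow the `gh pr view --json` output shown in Listing 4, and treating LoC as additions plus deletions is an assumption, because the text does not state how LoC is derived.

```python
from collections import defaultdict


def productivity_metrics(prs):
    """Aggregate per-author metrics from gh-style PR records (hypothetical helper)."""
    totals = defaultdict(lambda: {"prs": 0, "loc": 0})
    for pr in prs:
        # Only merged PRs count toward productivity; unmerged PRs have mergedAt = None.
        if pr.get("mergedAt") is None:
            continue
        login = pr["author"]["login"]
        totals[login]["prs"] += 1
        # Assumption: LoC = additions + deletions (not specified in the text).
        totals[login]["loc"] += pr["additions"] + pr["deletions"]
    return {
        login: {
            "num_prs": t["prs"],
            "total_loc": t["loc"],
            "loc_per_pr": t["loc"] / t["prs"],
        }
        for login, t in totals.items()
    }


# Toy records, invented for illustration only.
example_prs = [
    {"author": {"login": "alice"}, "mergedAt": "2023-01-01T00:00:00Z", "additions": 10, "deletions": 2},
    {"author": {"login": "alice"}, "mergedAt": "2023-02-01T00:00:00Z", "additions": 5, "deletions": 3},
    {"author": {"login": "bob"}, "mergedAt": None, "additions": 100, "deletions": 0},
]
print(productivity_metrics(example_prs))
```

Authors whose PRs were never merged do not appear in the result, which mirrors the merged-only rule for the productivity indicators.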
3.1. Selection of Target Repositories
- Exclude projects that are clearly not software projects.
- If a repository appears in both of the top 100 lists, it is classified based on its higher rank. For example, “react” (ranked 10th by Star and 22nd by Fork) is classified as a Star repository.
- Exclude repositories with an “Archived” status.
- Exclude repositories not following the fork-and-pull model [21].
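The rank-based rule for repositories that appear in both top 100 lists can be sketched as follows. This is an illustrative sketch, not the authors' selection script: the function name `classify_repository` and the dictionary inputs are hypothetical, and only the "react" ranks (10th by Star, 22nd by Fork) come from the text. The text does not say what happens when both ranks are equal, so defaulting to "Star" in that case is an assumption.

```python
def classify_repository(name, star_rank, fork_rank):
    """Return 'Star', 'Fork', or None for a repository (hypothetical helper).

    star_rank and fork_rank map repository name -> rank in each top 100 list,
    where 1 is the highest rank.
    """
    s = star_rank.get(name)
    f = fork_rank.get(name)
    if s is None and f is None:
        return None  # not in either list
    if f is None:
        return "Star"
    if s is None:
        return "Fork"
    # A smaller number is a higher rank. Ties default to "Star" (assumption).
    return "Star" if s <= f else "Fork"


print(classify_repository("react", {"react": 10}, {"react": 22}))
```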
3.2. Collection of PR Number Lists
Listing 1. Collecting PR number list using the “gh” command.

    gh pr list --state all --json number --jq '.[].number' --limit 999999999 > pr-num-list.txt
Listing 2. Example of collected PR number list.

    $ cat pr-num-list.txt | head -n 10
    197657
    197653
    197652
    197651
    197649
    197645
    197644
    197641
    197640
    197633
3.3. Collection of Detailed PR Information
Listing 3. Collecting detailed information based on PR number list.

    #!/bin/bash
    while read -r p; do
      gh pr view "$p" --json additions,assignees,author,autoMergeRequest,baseRefName,body,changedFiles,closed,closedAt,comments,commits,createdAt,deletions,files,headRefName,headRefOid,headRepository,headRepositoryOwner,id,isCrossRepository,isDraft,labels,latestReviews,maintainerCanModify,mergeCommit,mergeStateStatus,mergeable,mergedAt,mergedBy,milestone,number,potentialMergeCommit,projectCards,projectItems,reactionGroups,reviewDecision,reviewRequests,reviews,state,statusCheckRollup,title,updatedAt,url > "$p.json"
      sleep 0.8s
    done < pr-num-list.txt
Listing 4. Subset of detailed information collected by Listing 3.

    {
      "additions": 1,
      "assignees": [],
      "author": { "is_bot": true, "login": "app/" },
      "autoMergeRequest": null,
      "baseRefName": "master",
      "body": "TESTTTT",
      "changedFiles": 1,
      "closed": true,
      "closedAt": "2019-03-15T21:49:12Z",
      "comments": [],
      "commits": [
        {
          "authoredDate": "2019-03-15T21:48:32Z",
          "authors": [
            { "email": "48438775+peter11232@users.noreply.github.com", "id": "", "login": "", "name": "peter11232" }
          ],
          "committedDate": "2019-03-15T21:48:32Z",
          "messageBody": "TESTTTT",
          "messageHeadline": "Create Peter",
          "oid": "38e2ca9269fd897e987343f6ed452c5cb621c08d"
        }
      ],
      "createdAt": "2019-03-15T21:48:54Z",
      "deletions": 0,
      "files": [ { "path": "Peter", "additions": 1, "deletions": 0 } ],
      "headRefName": "patch-1",
      "headRefOid": "38e2ca9269fd897e987343f6ed452c5cb621c08d",
      "headRepository": null,
      "headRepositoryOwner": { "login": "" },
      "id": "MDExOlB1bGxSZXF1ZXN0MjYxNzA4ODIw",
      "isCrossRepository": true,
      "isDraft": false,
      "labels": [],
      "latestReviews": [],
      "maintainerCanModify": false,
      "mergeCommit": null,
      "mergeStateStatus": "DIRTY",
      "mergeable": "CONFLICTING",
      "mergedAt": null,
      "mergedBy": null,
      "milestone": null,
      "number": 70593,
      "potentialMergeCommit": null,
      "projectCards": [],
      "projectItems": [],
      "reactionGroups": [],
      "reviewDecision": "",
      "reviewRequests": [],
      "reviews": [],
      "state": "CLOSED",
      "statusCheckRollup": [],
      "title": "Create Peter",
      "updatedAt": "2020-03-27T02:14:21Z",
      "url": "https://github.com/microsoft/vscode/pull/70593"
    }
3.4. Database Loading
3.5. Sentiment Analysis
- Positive Example: “Done. Briefly considered not signing you all up for installer change notifications but then did so. Fortunately:)” (Illustrative SentiStrength-SE Score: +3)
- Neutral Example: “Can you please resolve merge conflicts and make sure all builds are green?” (Illustrative SentiStrength-SE Score: 0)
- Negative Example: “IMO, I don’t think it is worth the hassle to backport this. Cry out if you disagree.” (Illustrative SentiStrength-SE Score: −3)
- Positive Group: Average sentiment score > 0
- Neutral Group: Average sentiment score = 0
- Negative Group: Average sentiment score < 0.
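The grouping rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `sentiment_group` is hypothetical, and each comment is assumed to already carry a single integer sentiment score, as in the illustrative SentiStrength-SE scores shown with the example comments.

```python
from statistics import mean


def sentiment_group(scores):
    """Classify a participant by the arithmetic mean of per-comment scores."""
    avg = mean(scores)
    if avg > 0:
        return "Positive"
    if avg < 0:
        return "Negative"
    return "Neutral"


print(sentiment_group([3, 0]))    # mean is 1.5
print(sentiment_group([0, 0]))    # mean is 0
print(sentiment_group([2, -2]))   # mean is 0
```

One consequence of the rule is that a participant whose positive and negative comments cancel out exactly is placed in the Neutral group, alongside participants whose comments all score zero.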
3.6. Data Processing
- We collected the comments and average sentiment score for each individual.
- From the PR-related tables, we accessed the PR count, total LoC, and LoC per PR for each individual. Only merged PRs were considered for these metrics, as only the merged PRs were ultimately reflected in project outcomes. LoC was used because it is the most accessible and intuitive metric for measuring productivity in OSS, allowing for the collection of large amounts of data. This metric overcomes the difficulty of using other productivity measurement techniques such as in-depth observation or function points, which are challenging to use in OSS development contexts owing to the diverse affiliations (or lack thereof) of participants who are not necessarily part of the same organization.
- We merged the two datasets (comments and PR data) based on individual participant identifiers within each repository. Participants were included in the analysis if they contributed to at least one merged PR and authored at least one comment within the respective repository. Individuals with only PR data (no comments) or only comments (no PRs) for a given repository were excluded from the dataset.
- We derived the final data containing PR count, total LoC, LoC per PR, average comment sentiment score, and comment count for each individual.
- We classified each individual based on their average comment sentiment score into Positive (>0), Neutral (0), or Negative (<0).
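The merge step above keeps only participants who have both PR data and comment data in a given repository, which corresponds to an inner join. The sketch below illustrates that rule with plain dictionaries; the function name `merge_participants`, the `(repository, login)` key layout, and the field names are hypothetical and are not taken from the authors' database schema.

```python
def merge_participants(pr_metrics, comment_stats):
    """Inner-join two per-participant tables keyed by (repository, login)."""
    merged = {}
    # The key intersection drops participants with only PRs or only comments.
    for key in pr_metrics.keys() & comment_stats.keys():
        merged[key] = {**pr_metrics[key], **comment_stats[key]}
    return merged


# Toy data, invented for illustration only.
pr_metrics = {
    ("react", "alice"): {"num_prs": 2, "total_loc": 20, "loc_per_pr": 10.0},
    ("react", "bob"): {"num_prs": 1, "total_loc": 7, "loc_per_pr": 7.0},
}
comment_stats = {
    ("react", "alice"): {"comments": 3, "avg_sentiment": 0.5},
    ("react", "carol"): {"comments": 1, "avg_sentiment": 0.0},
}
print(merge_participants(pr_metrics, comment_stats))
```

Because the key includes the repository, the same person contributing to two repositories yields two records, one per repository.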
3.7. Statistical Analysis
- Kruskal–Wallis H test [68] (comparison among three groups)
- Mann–Whitney U test [69] (pairwise comparison between groups)
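To make the two rank-based tests concrete, the sketch below computes the Kruskal–Wallis H statistic (with the standard tie correction) and the Mann–Whitney U statistic of the first sample from average ranks. It is a hand-rolled illustration with hypothetical function names, not the authors' analysis code, and it omits p-values; in practice, library routines such as `scipy.stats.kruskal` and `scipy.stats.mannwhitneyu` would normally be used.

```python
from collections import Counter


def midranks(values):
    """Map each distinct value to its average rank; also return tie counts."""
    counts = Counter(values)
    ranks, start = {}, 1
    for v in sorted(counts):
        t = counts[v]
        ranks[v] = start + (t - 1) / 2  # mean of ranks start .. start + t - 1
        start += t
    return ranks, counts


def kruskal_wallis_h(*groups):
    """Tie-corrected H statistic for two or more groups."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks, counts = midranks(pooled)
    rank_term = sum(sum(ranks[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction; undefined (division by zero) if all values are identical.
    tie_term = sum(t ** 3 - t for t in counts.values())
    return h / (1 - tie_term / (n ** 3 - n))


def mann_whitney_u(x, y):
    """U statistic of sample x: its rank sum minus n_x (n_x + 1) / 2."""
    ranks, _ = midranks(list(x) + list(y))
    r1 = sum(ranks[v] for v in x)
    return r1 - len(x) * (len(x) + 1) / 2


# Toy samples, invented for illustration only.
print(kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9]))
print(mann_whitney_u([4, 5, 6], [1, 2, 3]))
```

Both statistics depend only on the ordering of the pooled values, which is why they suit heavily skewed count data such as PR and comment counts.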
4. Results
4.1. Observed Differences in Productivity and Communication Across Sentiment Groups
- For “No. of PRs,” the “Median” is the central value when participants within a sentiment group are ranked by the number of merged PRs they created: 50% of the participants created that many PRs or fewer, and 50% created that many or more. The “Mean” is the average number of merged PRs per participant in that group.
- For “Total LoC,” the “Median” and “Mean” refer to the respective central and average values of the total LoC (from merged PRs only) written by each participant within each sentiment group.
- For “LoC/PR,” each participant has an average number of LoC per their own merged PR(s). The “Median” and “Mean” values in the tables represent the central and average values of these individual participant-level averages within each sentiment group.
- For “Comments,” a “Median” of 2.0 for a group indicates that 50% of the participants in that group wrote two or fewer comments in PR discussions and 50% wrote two or more. A “Mean” of 6.45 represents the average number of comments written per participant in that group.
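The gap between a small median and a much larger mean arises when a few participants are far more active than the rest. The toy example below, with invented comment counts, shows the effect using Python's standard `statistics` module; it is an illustration only and does not use the study's data.

```python
from statistics import mean, median

# Invented comment counts for seven participants; one is very active.
comments = [1, 1, 2, 2, 2, 3, 40]

print(median(comments))  # central value of the sorted counts
print(mean(comments))    # pulled upward by the single large count
```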
4.2. Observed Correlation Patterns and Potential for Feedback Loops
5. Discussion
5.1. Limitations
- The quantification of human emotion using SentiStrength-SE may involve some distortion. Similarly, the chosen productivity metrics (LoC and PR) simplify complex technical contributions, potentially obscuring the nuances of work and its full impacts.
- This study was correlational and did not establish causation. Whether sentiment influences productivity, productivity influences sentiment, or unobserved factors affect both is unclear. Establishing causality would require longitudinal or experimental designs.
- The consistently lowest performance of the neutral group across all metrics lacks a definitive explanation. This result might be linked to a higher prevalence of one-off contributors less inclined to express sentiment; however, this requires further investigation.
- The study focused on top-tier GitHub projects because analyzing all repositories was infeasible; this focus may limit the generalizability of our findings to the broader and more diverse OSS ecosystem.
- This study did not control for potential confounding variables such as project size, application domain, or individual contributor experience. These factors can interact with sentiment and productivity in complex ways (e.g., different communication norms in larger projects and varying sentiment/productivity patterns based on experience). Future studies should address this issue by incorporating such variables.
- Systematic unavailability of participants’ demographic data (e.g., age, sex, location, professional background) or detailed contextual information on GitHub limited our ability to explore these factors.
- Average sentiment scores were computed as the arithmetic mean of all comments per participant without normalizing for the total volume of comments; consequently, a participant with one strongly valenced comment might be grouped with another having many mildly valenced comments if their averages were similar.
- All analyses treated contributor–repository instances as distinct data points. This approach implies that an individual who contributed to multiple projects within the sample was represented by multiple observations in this specific pooled dataset. Such non-independent observations could potentially impact the reliability of these specific results.
6. Conclusions
- Collecting data on individual activity duration and contribution patterns to verify whether neutral participants primarily consist of one-off contributors or exhibit significantly shorter engagement periods.
- Examining whether individuals exhibit consistent or varying sentiment across the different OSS projects to which they contribute and how this variation relates to their productivity in each project.
- Developing or applying robust methodologies to classify participants into distinct roles within OSS projects (e.g., core developers, peripheral contributors, maintainers, and issue reporters) and analyze how sentiment expression and its impact on productivity and project dynamics vary across these different roles. This classification could inform more tailored community management strategies.
- Investigating the impact of comment volume on average sentiment scores and its relationship with productivity, potentially by comparing sentiment profiles of participants with high versus low comment activity or by applying weighting schemes to sentiment aggregation.
- Investigating the stability or fluctuation of an individual’s sentiment across different contributions or over time. Understanding these temporal dynamics of sentiment is a promising direction that could provide deeper insights.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Fuggetta, A. Open Source Software—An Evaluation. J. Syst. Softw. 2003, 66, 77–90.
- Ducheneaut, N. Socialization in an Open Source Software Community: A Socio-Technical Analysis. Comput. Support. Coop. Work. (CSCW) 2005, 14, 323–368.
- Baxter, G.; Sommerville, I. Socio-technical systems: From design methods to systems engineering. Interact. Comput. 2011, 23, 4–17.
- West, J.; Gallagher, S. Challenges of Open Innovation: The Paradox of Firm Investment in Open-Source Software. R&D Manag. 2006, 36, 319–331.
- Guterres, A. Roadmap for Digital Cooperation; United Nations: New York, NY, USA, 2020.
- World Benchmarking Alliance. Digital Inclusion Benchmark 2023 Insights Report; World Benchmarking Alliance: Amsterdam, The Netherlands, 2023.
- World Benchmarking Alliance. Digital Inclusion Benchmark 2021 Scoring Guidelines; World Benchmarking Alliance: Amsterdam, The Netherlands, 2021.
- Fredrickson, B.L. The role of positive emotions in positive psychology: The broaden-and-build theory of positive emotions. Am. Psychol. 2001, 56, 218.
- Miller, C.; Cohen, S.; Klug, D.; Vasilescu, B.; Kästner, C. “Did you miss my comment or what?”: Understanding toxicity in open source discussions. In Proceedings of the 44th International Conference on Software Engineering, Pittsburgh, PA, USA, 21–29 May 2022; pp. 710–722.
- Cohen, S. Contextualizing toxicity in open source: A qualitative study. In Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Athens, Greece, 23–28 August 2021; pp. 1669–1671.
- Girardi, D.; Lanubile, F.; Novielli, N.; Serebrenik, A. Emotions and Perceived Productivity of Software Developers at the Workplace. IEEE Trans. Softw. Eng. 2022, 48, 3326–3341.
- Tulili, T.R.; Rastogi, A.; Capiluppi, A. Exploring turnover, retention and growth in an OSS Ecosystem. arXiv 2025, arXiv:2504.16483.
- Ortu, M.; Adams, B.; Destefanis, G.; Tourani, P.; Marchesi, M.; Tonelli, R. Are Bullies More Productive? Empirical Study of Affectiveness vs. Issue Fixing Time. In Proceedings of the 2015 IEEE/ACM 12th Working Conference on Mining Software Repositories, Florence, Italy, 16–17 May 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 303–313.
- Carige Junior, R.; Carneiro, G. Impact of Developers Sentiments on Practices and Artifacts in Open Source Software Projects: A Systematic Literature Review. In Proceedings of the 22nd International Conference on Enterprise Information Systems. SCITEPRESS—Science and Technology Publications, Prague, Czech Republic, 5–7 May 2020; pp. 31–42.
- Ferreira, I.; Stewart, K.; German, D.; Adams, B. A Longitudinal Study on the Maintainers’ Sentiment of a Large Scale Open Source Ecosystem. In Proceedings of the 2019 IEEE/ACM 4th International Workshop on Emotion Awareness in Software Engineering (SEmotion), Montreal, QC, Canada, 28 May 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 17–22.
- Luthans, F. The need for and meaning of positive organizational behavior. J. Organ. Behav. Int. J. Ind. Occup. Organ. Psychol. Behav. 2002, 23, 695–706.
- Storey, M.A.; Singer, L.; Cleary, B.; Figueira Filho, F.; Zagalsky, A. The (R) Evolution of social media in software engineering. In Proceedings of the Future of Software Engineering Proceedings, Hyderabad, India, 31 May–7 June 2014; pp. 100–116.
- Dwyer, C. Socio-technical Systems Theory and Environmental Sustainability. Sprouts 2011, 3–5.
- Trist, E.L.; Bamforth, K.W. Some Social and Psychological Consequences of the Longwall Method of Coal-Getting: An Examination of the Psychological Situation and Defences of a Work Group in Relation to the Social Structure and Technological Content of the Work System. Hum. Relations 1951, 4, 3–38.
- Scacchi, W. Understanding the requirements for developing open source software systems. IEE Proc. Softw. 2002, 149, 24.
- Padhye, R.; Mani, S.; Sinha, V.S. A Study of External Community Contribution to Open-Source Projects on GitHub. In Proceedings of the 11th Working Conference on Mining Software Repositories, Hyderabad, India, 31 May–1 June 2014; ACM: New York, NY, USA, 2014; pp. 332–335.
- Soares, D.M.; De Lima Júnior, M.L.; Murta, L.; Plastino, A. Acceptance Factors of Pull Requests in Open-Source Projects. In Proceedings of the 30th Annual ACM Symposium on Applied Computing, Salamanca, Spain, 13–17 April 2015; ACM: New York, NY, USA, 2015; pp. 1541–1546.
- Gousios, G.; Pinzger, M.; Deursen, A.V. An Exploratory Study of the Pull-Based Software Development Model. In Proceedings of the 36th International Conference on Software Engineering, Hyderabad, India, 31 May–7 June 2014; ACM: New York, NY, USA, 2014; pp. 345–355.
- Guo, Y.; Leitner, P. Studying the Impact of CI on Pull Request Delivery Time in Open Source Projects—A Conceptual Replication. PeerJ Comput. Sci. 2019, 5, e245.
- LINE. Enable to Receive Compressed Request from Client by Joonhaeng. Pull Request #3087. LINE/Armeria. Available online: https://github.com/line/armeria/pull/3087 (accessed on 30 May 2025).
- Meyer, A.N.; Fritz, T.; Murphy, G.C.; Zimmermann, T. Software Developers’ Perceptions of Productivity. In Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering, Hong Kong, China, 16–21 November 2014; ACM: New York, NY, USA, 2014; pp. 19–29.
- Zhou, M.; Mockus, A. Developer Fluency: Achieving True Mastery in Software Projects. In Proceedings of the Eighteenth ACM SIGSOFT International Symposium on Foundations of Software Engineering, Santa Fe, NM, USA, 7–11 November 2010; ACM: New York, NY, USA, 2010; pp. 137–146.
- Kieburtz, R.; McKinney, L.; Bell, J.; Hook, J.; Kotov, A.; Lewis, J.; Oliva, D.; Sheard, T.; Smith, I.; Walton, L. A Software Engineering Experiment in Software Component Generation. In Proceedings of the IEEE 18th International Conference on Software Engineering, Berlin, Germany, 25–30 March 1996; pp. 542–552.
- Devanbu, P.; Karstu, S.; Melo, W.; Thomas, W. Analytical and Empirical Evaluation of Software Reuse Metrics. In Proceedings of the IEEE 18th International Conference on Software Engineering, Berlin, Germany, 25–30 March 1996; pp. 189–199.
- Blackburn, J.; Scudder, G.; Van Wassenhove, L. Improving Speed and Productivity of Software Development: A Global Survey of Software Developers. IEEE Trans. Softw. Eng. 1996, 22, 875–885.
- Delorey, D.P.; Knutson, C.D.; Chun, S. Do Programming Languages Affect Productivity? A Case Study Using Data from Open Source Projects. In Proceedings of the First International Workshop on Emerging Trends in FLOSS Research and Development (FLOSS’07: ICSE Workshops 2007), Minneapolis, MN, USA, 20–26 May 2007; IEEE: Piscataway, NJ, USA, 2007; p. 8.
- Symons, C. Function Point Analysis: Difficulties and Improvements. IEEE Trans. Softw. Eng. 1988, 14, 2–11.
- Jiang, Q.; Lee, Y.C.; Davis, J.G.; Zomaya, A.Y. Diversity, Productivity, and Growth of Open Source Developer Communities. arXiv 2018, arXiv:1809.03725.
- Thurstone, L.L. The measurement of social attitudes. J. Abnorm. Soc. Psychol. 1931, 26, 249.
- Sarnoff, I. Psychoanalytic theory and social attitudes. Public Opin. Q. 1960, 24, 251–279.
- Greenwald, A.G. On Defining Attitude and Attitude Theory. In Psychological Foundations of Attitudes; Elsevier: Amsterdam, The Netherlands, 1968; pp. 361–388.
- Strzalkowski, T.; Harrison, T.; Sa, N.; Katsios, G.; Khoja, E. GitHub as a Social Network. In Advances in Artificial Intelligence, Software and Systems Engineering; Springer: Cham, Switzerland, 2018; pp. 379–390. ISSN 2194-5365.
- Ng, C.L.; Siew-Hoong, A.; Tong-Ming, L. A Study on the Element of Sentiment toward Knowledge Sharing among Knowledge Workers in a Virtual CoP. Inf. Manag. Bus. Rev. 2013, 5, 553–560.
- Dehbozorgi, N.; Maher, M.L.; Dorodchi, M. Emotion Mining from Speech in Collaborative Learning. Adv. Sci. Technol. Eng. Syst. J. 2021, 6, 90–100.
- Joshi, R.B. Reinforcement Learning for GitHub Pull Request Predictions: Analyzing Development Dynamics. Master’s Thesis, Carleton University, Ottawa, ON, Canada, 2023.
- Guzman, E.; Azócar, D.; Li, Y. Sentiment Analysis of Commit Comments in GitHub: An Empirical Study. In Proceedings of the 11th Working Conference on Mining Software Repositories, Hyderabad, India, 31 May–1 June 2014; ACM: New York, NY, USA, 2014; pp. 352–355.
- Asri, I.E.; Kerzazi, N.; Uddin, G.; Khomh, F.; Janati Idrissi, M. An Empirical Study of Sentiments in Code Reviews. Inf. Softw. Technol. 2019, 114, 37–54.
- Huq, S.F.; Sadiq, A.Z.; Sakib, K. Is Developer Sentiment Related to Software Bugs: An Exploratory Study on GitHub Commits. In Proceedings of the 2020 IEEE 27th International Conference on Software Analysis, Evolution and Reengineering (SANER), London, ON, Canada, 18–21 February 2020; IEEE: Piscataway, NJ, USA, 2020; pp. 527–531.
- Li, L.; Cao, J.; Lo, D. Sentiment Analysis over Collaborative Relationships in Open Source Software Projects. In Proceedings of the International Conference on Software Engineering and Knowledge Engineering, Pittsburgh, PA, USA, 9–19 July 2020.
- Steinmacher, I.; Conte, T.; Gerosa, M.A.; Redmiles, D. Social Barriers Faced by Newcomers Placing Their First Contribution in Open Source Software Projects. In Proceedings of the 18th ACM Conference on Computer Supported Cooperative Work & Social Computing, Vancouver, BC, Canada, 14–18 March 2015; ACM: New York, NY, USA, 2015; pp. 1379–1392.
- Licorish, S.A.; MacDonell, S.G. Exploring the Links between Software Development Task Type, Team Attitudes and Task Completion Performance: Insights from the Jazz Repository. Inf. Softw. Technol. 2018, 97, 10–25.
- Wagner, S.; Ruhe, M. A Systematic Review of Productivity Factors in Software Development. In Proceedings of the 2nd International Workshop on Software Productivity Analysis and Cost Estimation (SPACE 2008), Beijing, China, 9 July 2008.
- Meyer, A.N.; Zimmermann, T.; Fritz, T. Characterizing Software Developers by Perceptions of Productivity. In Proceedings of the 2017 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM), Toronto, ON, Canada, 9–10 November 2017; pp. 105–110.
- Murphy-Hill, E.; Jaspan, C.; Sadowski, C.; Shepherd, D.; Phillips, M.; Winter, C.; Knight, A.; Smith, E.; Jorde, M. What Predicts Software Developers’ Productivity? IEEE Trans. Softw. Eng. 2021, 47, 582–594.
- Satratzemi, M.; Xinogalos, S.; Tsompanoudi, D.; Karamitopoulos, L. Examining Student Performance and Attitudes on Distributed Pair Programming. Sci. Program 2018, 1–8.
- Anany, M.; Hussien, H.; Aly, S.; Sakr, N. Influence of Emotions on Software Developer Productivity. In Proceedings of the 9th International Conference on Pervasive and Embedded Computing and Communication Systems, Vienna, Austria, 19–20 September 2019; pp. 75–82.
- Wachs, J.; Nitecki, M.; Schueller, W.; Polleres, A. The geography of open source software: Evidence from github. Technol. Forecast. Soc. Chang. 2022, 176, 121478.
- Schreiber, R.R. Organizational influencers in open-source software projects. Int. J. Open Source Softw. Process. (IJOSSP) 2023, 14, 1–20.
- Rajanen, M.; Iivari, N. Examining usability work and culture in OSS. In Proceedings of the Open Source Systems: Adoption and Impact: 11th IFIP WG 2.13 International Conference, OSS 2015, Florence, Italy, 16–17 May 2015; Proceedings 11. Springer: Berlin/Heidelberg, Germany, 2015; pp. 58–67.
- Qiu, H.S.; Li, Y.L.; Padala, S.; Sarma, A.; Vasilescu, B. The signals that potential contributors look for when choosing open-source projects. Proc. ACM Hum.-Comput. Interact. 2019, 3, 1–29.
- Kalliamvakou, E.; Gousios, G.; Blincoe, K.; Singer, L.; German, D.M.; Damian, D. The Promises and Perils of Mining GitHub. In Proceedings of the 11th Working Conference on Mining Software Repositories, Hyderabad, India, 31 May–1 June 2014; ACM: New York, NY, USA, 2014; pp. 92–101.
- Kalliamvakou, E.; Gousios, G.; Blincoe, K.; Singer, L.; German, D.M.; Damian, D. An In-Depth Study of the Promises and Perils of Mining GitHub. Empir. Softw. Eng. 2016, 21, 2035–2071.
- Borges, H.; Hora, A.; Valente, M.T. Understanding the Factors That Impact the Popularity of GitHub Repositories. In Proceedings of the 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME), Raleigh, NC, USA, 2–7 October 2016; pp. 334–344.
- Jiang, J.; Lo, D.; He, J.; Xia, X.; Kochhar, P.S.; Zhang, L. Why and How Developers Fork What from Whom in GitHub. Empir. Softw. Eng. 2017, 22, 547–578.
- Jemerov, D.; Isakova, S. Kotlin in Action; Simon and Schuster: New York, NY, USA, 2017.
- Reese, G. Database Programming with JDBC and JAVA; O’Reilly Media, Inc.: Newton, MA, USA, 2000.
- Islam, M.R.; Zibran, M.F. SentiStrength-SE: Exploiting Domain Specificity for Improved Sentiment Analysis in Software Engineering Text. J. Syst. Softw. 2018, 145, 125–146.
- Thelwall, M.; Buckley, K.; Paltoglou, G. Sentiment Strength Detection for the Social Web. J. Am. Soc. Inf. Sci. Technol. 2012, 63, 163–173.
- Ortu, M.; Murgia, A.; Destefanis, G.; Tourani, P.; Tonelli, R.; Marchesi, M.; Adams, B. The emotional side of software developers in JIRA. In Proceedings of the 13th International Conference on Mining Software Repositories, Austin, TX, USA, 14–22 May 2016; pp. 480–483.
- Lin, B.; Zampetti, F.; Bavota, G.; Di Penta, M.; Lanza, M.; Oliveto, R. Sentiment analysis for software engineering: How far can we go? In Proceedings of the 40th International Conference on Software Engineering, Gothenburg, Sweden, 27 May–3 June 2018; pp. 94–104.
- Islam, M.R.; Zibran, M.F. Leveraging Automated Sentiment Analysis in Software Engineering. In Proceedings of the 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), Buenos Aires, Argentina, 20–21 May 2017; pp. 203–214.
- Obaidi, M.; Nagel, L.; Specht, A.; Klünder, J. Sentiment Analysis Tools in Software Engineering: A Systematic Mapping Study. Inf. Softw. Technol. 2022, 151, 107018.
- Kruskal, W.H.; Wallis, W.A. Use of ranks in one-criterion variance analysis. J. Am. Stat. Assoc. 1952, 47, 583–621.
- Mann, H.B.; Whitney, D.R. On a test of whether one of two random variables is stochastically larger than the other. Ann. Math. Stat. 1947, 18, 50–60.
- Myers, J.; Well, A.; Lorch, R. Research Design and Statistical Analysis, 3rd ed.; Taylor & Francis: Oxfordshire, UK, 2013.
- Kendall, M.G. A New Measure of Rank Correlation. Biometrika 1938, 30, 81–93.
- Python Software Foundation. Python 3.11.7 Documentation. Available online: https://docs.python.org/3.11/ (accessed on 30 May 2025).
- McKinney, W. Data Structures for Statistical Computing in Python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28 June–3 July 2010; Walt, S.v.d., Millman, J., Eds.; 2010; pp. 56–61.
- Virtanen, P.; Gommers, R.; Oliphant, T.E.; Haberland, M.; Reddy, T.; Cournapeau, D.; Burovski, E.; Peterson, P.; Weckesser, W.; Bright, J.; et al. SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python. Nat. Methods 2020, 17, 261–272.
- Dunn, O.J. Multiple Comparisons Among Means. J. Am. Stat. Assoc. 1961, 56, 52–64.
- Dunnett, C.W. A Multiple Comparison Procedure for Comparing Several Treatments with a Control. J. Am. Stat. Assoc. 1955, 50, 1096–1121.
- Rea, L.M.; Parker, R.A. Designing and Conducting Survey Research: A Comprehensive Guide; John Wiley & Sons: Hoboken, NJ, USA, 2014.
- Vargha, A.; Delaney, H.D. A Critique and Improvement of the CL Common Language Effect Size Statistics of McGraw and Wong. J. Educ. Behav. Stat. 2000, 25, 101–132.
- Sterman, J. System Dynamics: Systems Thinking and Modeling for a Complex World; Working Paper; Massachusetts Institute of Technology, Engineering Systems Division: Cambridge, MA, USA, 2002.
- Pearce, J.M. The Case for Open Source Appropriate Technology. Environ. Dev. Sustain. 2012, 14, 425–431.
- Hoe, N.S. Breaking Barriers: The Potential of Free and Open Source Software for Sustainable Human Development; United Nations Development Programme: New York, NY, USA, 2007.
- Weiss, H.M.; Cropanzano, R. Affective events theory. Res. Organ. Behav. 1996, 18, 1–74.

Star Top 10: Repository Name | Star Top 10: Data Collection Date | Fork Top 10: Repository Name | Fork Top 10: Data Collection Date
---|---|---|---
react | 5 November 2023 | tensorflow | 31 October 2023
ohmyzsh | 5 November 2023 | bootstrap | 25 October 2023
flutter | 4 November 2023 | opencv | 30 October 2023
vscode | 8 November 2023 | kubernetes | 29 October 2023
AutoGPT | 2 November 2023 | bitcoin | 24 October 2023
transformers | 7 November 2023 | three.js | 1 November 2023
next.js | 4 November 2023 | qmk_firmware | 30 October 2023
react-native | 6 November 2023 | material-ui | 29 October 2023
electron | 2 November 2023 | django | 27 October 2023
stable-diffusion-webui | 7 November 2023 | cpython | 26 October 2023

Sentiment Group | Metric | Number of Contributors | Median | Mean
---|---|---|---|---
Total | - | 24,607 | - | -
Positive | No. of PRs | 12,262 | 2.0 | 17.37
Positive | Total LoC | 12,262 | 86.0 | 9865.64
Positive | LoC/PR | 12,262 | 39.12 | 332.31
Positive | Comments | 12,262 | 6.0 | 91.65
Neutral | No. of PRs | 7647 | 1.0 | 3.22
Neutral | Total LoC | 7647 | 17.0 | 620.69
Neutral | LoC/PR | 7647 | 12.0 | 135.34
Neutral | Comments | 7647 | 2.0 | 6.45
Negative | No. of PRs | 4698 | 2.0 | 10.04
Negative | Total LoC | 4698 | 51.0 | 7661.10
Negative | LoC/PR | 4698 | 28.0 | 262.59
Negative | Comments | 4698 | 5.0 | 46.23

Sentiment Group | Metric | Number of Contributors | Median | Mean
---|---|---|---|---
Total | - | 8321 | - | -
Positive | No. of PRs | 4232 | 1.0 | 16.65
Positive | Total LoC | 4232 | 62.5 | 5832.69
Positive | LoC/PR | 4232 | 34.4 | 201.42
Positive | Comments | 4232 | 4.0 | 55.96
Neutral | No. of PRs | 2632 | 1.0 | 5.14
Neutral | Total LoC | 2632 | 17.0 | 336.69
Neutral | LoC/PR | 2632 | 13.0 | 81.47
Neutral | Comments | 2632 | 1.0 | 8.13
Negative | No. of PRs | 1457 | 1.0 | 11.85
Negative | Total LoC | 1457 | 42.0 | 3372.37
Negative | LoC/PR | 1457 | 25.0 | 151.09
Negative | Comments | 1457 | 4.0 | 30.18

Sentiment Group | Metric | Number of Contributors | Median | Mean
---|---|---|---|---
Total | - | 16,286 | - | -
Positive | No. of PRs | 8030 | 2.0 | 17.75
Positive | Total LoC | 8030 | 104.0 | 11,991.10
Positive | LoC/PR | 8030 | 43.05 | 401.29
Positive | Comments | 8030 | 7.0 | 110.47
Neutral | No. of PRs | 5015 | 1.0 | 2.21
Neutral | Total LoC | 5015 | 17.0 | 769.74
Neutral | LoC/PR | 5015 | 12.0 | 163.62
Neutral | Comments | 5015 | 2.0 | 5.58
Negative | No. of PRs | 3241 | 2.0 | 9.22
Negative | Total LoC | 3241 | 58.0 | 9589.11
Negative | LoC/PR | 3241 | 29.0 | 312.71
Negative | Comments | 3241 | 6.0 | 53.45

Category | Kruskal–Wallis H: Stat. | Kruskal–Wallis H: Effect Size | Kruskal–Wallis H: p-Value | Mann–Whitney U (Positive-Neutral): Stat. | Positive-Neutral: Effect Size | Positive-Neutral: p-Value | Mann–Whitney U (Neutral-Negative): Stat. | Neutral-Negative: Effect Size | Neutral-Negative: p-Value | Mann–Whitney U (Positive-Negative): Stat. | Positive-Negative: Effect Size | Positive-Negative: p-Value
---|---|---|---|---|---|---|---|---|---|---|---|---
No. of PRs | 1586.55 | 0.064 | < | 60,981,076.0 | 0.300 | < | 13,562,406.0 | −0.244 | < | 26,904,913.0 | −0.065 | <
Total LoC | 1790.90 | 0.072 | < | 63,403,497.0 | 0.352 | < | 13,324,718.0 | −0.258 | < | 25,763,927.5 | −0.105 | <
LoC/PR | 1172.00 | 0.047 | < | 60,285,774.5 | 0.285 | < | 14,415,289.5 | −0.197 | < | 26,014,063.0 | −0.096 | <
Comments | 3971.18 | 0.161 | < | 70,109,454.5 | 0.495 | < | 9,059,851.5 | −0.495 | < | 27,658,384.5 | −0.039 | <

Category | Kruskal–Wallis H: Stat. | Kruskal–Wallis H: Effect Size | Kruskal–Wallis H: p-Value | Mann–Whitney U (Positive-Neutral): Stat. | Mann–Whitney U (Positive-Neutral): Effect Size | Mann–Whitney U (Positive-Neutral): p-Value | Mann–Whitney U (Neutral-Negative): Stat. | Mann–Whitney U (Neutral-Negative): Effect Size | Mann–Whitney U (Neutral-Negative): p-Value | Mann–Whitney U (Positive-Negative): Stat. | Mann–Whitney U (Positive-Negative): Effect Size | Mann–Whitney U (Positive-Negative): p-Value |
---|---|---|---|---|---|---|---|---|---|---|---|---|
No. of PRs | 394.82 | 0.047 | < | 6,973,654.0 | 0.252 | < | 1,548,296.5 | −0.192 | < | 2,894,748.0 | −0.061 | < |
Total LoC | 460.53 | 0.055 | < | 7,270,662.5 | 0.305 | < | 1,501,109.0 | −0.217 | < | 2,789,273.5 | −0.095 | < |
LoC/PR | 304.90 | 0.036 | < | 6,954,815.0 | 0.248 | < | 1,596,388.0 | −0.167 | < | 2,811,140.5 | −0.088 | < |
Comments | 1300.30 | 0.156 | < | 8,260,316.5 | 0.483 | < | 995,769.0 | −0.480 | < | 2,969,978.5 | −0.036 | * |

Category | Kruskal–Wallis H: Stat. | Kruskal–Wallis H: Effect Size | Kruskal–Wallis H: p-Value | Mann–Whitney U (Positive-Neutral): Stat. | Mann–Whitney U (Positive-Neutral): Effect Size | Mann–Whitney U (Positive-Neutral): p-Value | Mann–Whitney U (Neutral-Negative): Stat. | Mann–Whitney U (Neutral-Negative): Effect Size | Mann–Whitney U (Neutral-Negative): p-Value | Mann–Whitney U (Positive-Negative): Stat. | Mann–Whitney U (Positive-Negative): Effect Size | Mann–Whitney U (Positive-Negative): p-Value |
---|---|---|---|---|---|---|---|---|---|---|---|---|
No. of PRs | 1211.81 | 0.074 | < | 26,721,644.0 | 0.327 | < | 5,952,711.0 | −0.267 | < | 12,032,923.5 | −0.075 | < |
Total LoC | 1345.71 | 0.082 | < | 27,723,557.0 | 0.376 | < | 5,891,048.0 | −0.275 | < | 11,501,190.5 | −0.116 | < |
LoC/PR | 873.12 | 0.053 | < | 26,267,469.5 | 0.304 | < | 6,420,072.5 | −0.210 | < | 11,654,302.5 | −0.104 | < |
Comments | 2713.92 | 0.166 | < | 30,344,676.5 | 0.507 | < | 4,051,938.5 | −0.501 | < | 12,328,809.5 | −0.052 | < |
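
To make the pairwise comparisons above more concrete, the sketch below computes a Mann–Whitney U statistic with average ranks for ties and converts it to a rank-biserial correlation, r = 2U / (n1 · n2) − 1. The function names and toy samples are hypothetical. That the value tabulated beside each U statistic is this particular effect size is an inference from the reported numbers, not something the tables state.

```python
def mann_whitney_u(x, y):
    """U statistic of sample x versus sample y, using average ranks for ties."""
    pooled = sorted((value, idx) for idx, value in enumerate(list(x) + list(y)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = avg_rank
        i = j + 1
    n1 = len(x)
    rank_sum_x = sum(ranks[:n1])  # the first n1 original indices belong to x
    return rank_sum_x - n1 * (n1 + 1) / 2


def rank_biserial(u, n1, n2):
    """Rank-biserial correlation derived from U: r = 2U / (n1 * n2) - 1."""
    return 2 * u / (n1 * n2) - 1


# Toy samples (hypothetical values, not study data).
positive = [3, 4, 5]
neutral = [1, 2, 3]
u = mann_whitney_u(positive, neutral)
print(u, rank_biserial(u, len(positive), len(neutral)))

# Cross-check using the tabulated Positive-Neutral U for No. of PRs
# and the group sizes of 12,262 and 7647 contributors.
print(rank_biserial(60_981_076.0, 12_262, 7_647))
```

With the tabulated U of 60,981,076.0 and group sizes of 12,262 and 7647, the formula gives roughly 0.30, in line with the 0.300 reported for that comparison. Under this reading, a positive value means the first-named group tends to rank higher and a negative value means it tends to rank lower.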

Category | Variable Pair | Spearman’s ρ: Correlation Coefficient | Spearman’s ρ: p-Value | Kendall’s Tau: Correlation Coefficient | Kendall’s Tau: p-Value |
---|---|---|---|---|---|
Positive | No. of PRs vs. Total LoC | 0.6944 | < | 0.5549 | < |
Positive | No. of PRs vs. Comments | 0.6793 | < | 0.5541 | < |
Positive | Total LoC vs. Comments | 0.6025 | < | 0.4465 | < |
Neutral | No. of PRs vs. Total LoC | 0.4585 | < | 0.3722 | < |
Neutral | No. of PRs vs. Comments | 0.3626 | < | 0.3156 | < |
Neutral | Total LoC vs. Comments | 0.3427 | < | 0.2614 | < |
Negative | No. of PRs vs. Total LoC | 0.6471 | < | 0.5157 | < |
Negative | No. of PRs vs. Comments | 0.5908 | < | 0.4779 | < |
Negative | Total LoC vs. Comments | 0.5294 | < | 0.3875 | < |

Category | Variable Pair | Spearman’s ρ: Correlation Coefficient | Spearman’s ρ: p-Value | Kendall’s Tau: Correlation Coefficient | Kendall’s Tau: p-Value |
---|---|---|---|---|---|
Positive | No. of PRs vs. Total LoC | 0.6610 | < | 0.5335 | < |
Positive | No. of PRs vs. Comments | 0.6076 | < | 0.4992 | < |
Positive | Total LoC vs. Comments | 0.5687 | < | 0.4247 | < |
Neutral | No. of PRs vs. Total LoC | 0.4259 | < | 0.3458 | < |
Neutral | No. of PRs vs. Comments | 0.3176 | < | 0.2808 | < |
Neutral | Total LoC vs. Comments | 0.3066 | < | 0.2360 | < |
Negative | No. of PRs vs. Total LoC | 0.6187 | < | 0.4999 | < |
Negative | No. of PRs vs. Comments | 0.5344 | < | 0.4388 | < |
Negative | Total LoC vs. Comments | 0.5190 | < | 0.3841 | < |

Category | Variable Pair | Spearman’s ρ: Correlation Coefficient | Spearman’s ρ: p-Value | Kendall’s Tau: Correlation Coefficient | Kendall’s Tau: p-Value |
---|---|---|---|---|---|
Positive | No. of PRs vs. Total LoC | 0.7077 | < | 0.5623 | < |
Positive | No. of PRs vs. Comments | 0.7113 | < | 0.5799 | < |
Positive | Total LoC vs. Comments | 0.6159 | < | 0.4550 | < |
Neutral | No. of PRs vs. Total LoC | 0.4740 | < | 0.3847 | < |
Neutral | No. of PRs vs. Comments | 0.3840 | < | 0.3319 | < |
Neutral | Total LoC vs. Comments | 0.3617 | < | 0.2752 | < |
Negative | No. of PRs vs. Total LoC | 0.6552 | < | 0.5192 | < |
Negative | No. of PRs vs. Comments | 0.6097 | < | 0.4912 | < |
Negative | Total LoC vs. Comments | 0.5292 | < | 0.3864 | < |
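
The two rank correlation coefficients tabulated above can be illustrated with a short, dependency-free sketch. It computes Spearman's coefficient as the Pearson correlation of average ranks and Kendall's tau in its tie-adjusted tau-b form. The function names and toy data are hypothetical, and whether the authors used the tau-b variant is an assumption. The sketch omits p-values; in practice `scipy.stats.spearmanr` and `scipy.stats.kendalltau` return both the coefficient and a p-value.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


def kendall_tau_b(x, y):
    """Kendall's tau-b, counting pairs in O(n^2); fine for small samples."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from the denominator
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt(
        (untied + tied_x_only) * (untied + tied_y_only)
    )


# Toy per-contributor values (hypothetical, not study data).
num_prs = [1, 2, 2, 5, 9]
comments = [0, 3, 4, 4, 20]
print(spearman_rho(num_prs, comments), kendall_tau_b(num_prs, comments))
```

Kendall's tau is typically smaller in magnitude than Spearman's coefficient on the same data, which matches the pattern in the tables (for example, 0.6944 versus 0.5549 for the positive group's No. of PRs vs. Total LoC).
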
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Lee, J.; Cho, K. Fostering Productive Open Source Systems: Understanding the Impact of Collaborator Sentiment. Systems 2025, 13, 445. https://doi.org/10.3390/systems13060445