Would You Fix This Code for Me? Effects of Repair Source and Commenting on Trust in Code Repair
Abstract
:1. Introduction
2. Related Work
2.1. GenProg
2.2. Trust in Automation
2.3. Trust in Code
2.4. Trust Biases
2.5. Comments
2.6. Code Reputation and Cognitive Biases
3. Method
3.1. Participants
3.2. Measures
3.2.1. Trustworthiness
3.2.2. Trust Intentions
3.3. Stimuli
3.4. Procedure
4. Results
4.1. Trustworthiness
4.2. Trust Intentions
4.3. Use Endorsement
4.4. Qualitative Coding
5. Discussion
5.1. Source
5.2. Commenting
5.3. Implications
5.4. Limitations
Supplementary Materials
Author Contributions
Funding
Conflicts of Interest
Ethical Statements
References
- Lee, J.D.; See, K.A. Trust in automation: Designing for appropriate reliance. Hum. Factors 2004, 46, 50–80. [Google Scholar] [CrossRef]
- Britton, T.; Jeng, L.; Carver, G.; Cheak, P.; Katzenellenbogen, T. Reversible Debugging Software; Technical Report for University of Cambridge Judge Business School: Cambridge, UK, 2013. [Google Scholar]
- German, A. Software static code analysis lessons learned. Crosstalk 2003, 16, 19–22. [Google Scholar]
- Arcuri, A. On the automation of fixing software bugs. In Proceedings of the 30th International Conference on Software Engineering, Leipzig, Germany, 10–18 May 2008; pp. 1003–1006. [Google Scholar]
- Weimer, W.; Nguyen, T.; Le Goues, C.; Forrest, S. Automatically finding patches using genetic programming. In Proceedings of the 31st International Conference on Software Engineering, Vancouver, BC, Canada, 16–24 May 2009; pp. 364–374. [Google Scholar]
- Gazzola, L.; Mariani, L.; Micucci, D. Automatic Software Repair: A Survey. In Proceedings of the 2018 IEEE/ACM 40th International Conference on Software Engineering, Gothenburg, Sweden, 27 May–3 June 2018; p. 1219. [Google Scholar]
- Martinez, M.; Monperrus, M. Astor: Exploring the design space of generate-and-validate program repair beyond GenProg. J. Syst. Softw. 2019, 151, 65–80. [Google Scholar] [CrossRef] [Green Version]
- Wickens, C.D.; Li, H.; Santamaria, A.; Sebok, A.; Sarter, N.B. Stages and levels of automation: An integrated meta-analysis. In Proceedings of the Human Factors and Ergonomics Society Annual Meeting, San Francisco, CA, USA, 27 September–1 October 2010; Volume 54, pp. 389–393. [Google Scholar]
- Alarcon, G.M.; Militello, L.G.; Ryan, P.; Jessup, S.A.; Calhoun, C.S.; Lyons, J.B. A descriptive model of computer code trustworthiness. J. Cog. Eng. Decis. Mak. 2017, 11, 107–121. [Google Scholar] [CrossRef]
- Banker, R.D.; Kauffman, R.J. Reuse and productivity in integrated computer-aided software engineering: An empirical study. MIS Q. 1991, 15, 375–401. [Google Scholar] [CrossRef] [Green Version]
- Lim, W.C. Effects of reuse on quality, productivity, and economics. IEEE Softw. 1994, 11, 23–30. [Google Scholar] [CrossRef]
- Albayrak, Ö.; Davenport, D. Impact of maintainability defects on code inspections. In Proceedings of the 2010 ACM-IEEE International Symposium on Empirical Software Engineering and Measurement, Bolzano-Bozen, Italy, 16–17 September 2010; ACM: New York, NY, USA, 2010; pp. 50–53. [Google Scholar]
- Beller, M.; Bacchelli, A.; Zaidman, A.; Juergens, E. Modern Code Reviews in Open-Source Projects: Which Problems Do They Fix? In Proceedings of the 11th Working Conference on Mining Software Repositories, Hyderabad, India, 31 May–1 June 2014; ACM: New York, NY, USA, 2014; pp. 202–211. [Google Scholar]
- Alarcon, G.; Ryan, T. Trustworthiness Perceptions of Computer Code: A Heuristic-Systematic Processing Model. In Proceedings of the 51st Hawaii International Conference on System Sciences, Waikoloa Village, HI, USA, 3–6 January 2018. [Google Scholar]
- Chaiken, S. Heuristic versus systematic information processing and the use of source versus message cues in persuasion. J. Personal. Soc. Psychol. 1980, 39, 752–766. [Google Scholar] [CrossRef]
- Alarcon, G.M.; Gamble, R.; Jessup, S.A.; Walter, C.; Ryan, T.J.; Wood, D.W.; Calhoun, C.S. Application of the heuristic-systematic model to computer code trustworthiness: The influence of reputation and transparency. Cogent Psychol. 2017, 4, 1389640. [Google Scholar] [CrossRef]
- Capiola, A.; Nelson, A.D.; Walter, C.; Ryan, T.J.; Jessup, S.A.; Alarcon, G.M.; Gamble, R.F.; Pfahler, M.D. Trust in Software: Attributes of Computer Code and the Human Factors that Influence Utilization Metrics. In Proceedings of the International Conference on Human-Computer Interaction, Orlando, FL, USA, 26–31 July 2019; Springer: Cham, Switzerland, 2019; pp. 190–196. [Google Scholar]
- Ryan, T.J.; Walter, C.; Alarcon, G.M.; Gamble, R.F.; Jessup, S.A.; Capiola, A.A. Individual Differences in Trust in Code: The Moderating Effects of Personality on the Trustworthiness-Trust Rrelationship. In Proceedings of the International Conference on Human-Computer Interaction, Las Vegas, NV, USA, 15–20 July 2018; Springer: Cham, Switzerland, 2018; pp. 370–376. [Google Scholar]
- Walter, C.; Gamble, R.; Alarcon, G.; Jessup, S.; Calhoun, C. Developing a mechanism to study code trustworthiness. In Proceedings of the 50th Hawaii International Conference on System Sciences, Waikoloa Village, HI, USA, 4–7 January 2017. [Google Scholar]
- Alarcon, G.M.; Gamble, R.F.; Ryan, T.J.; Walter, C.; Jessup, S.A.; Wood, D.W.; Capiola, A. The influence of commenting validity, placement, and style on perceptions of computer code trustworthiness: A heuristic-systematic processing approach. Appl. Ergon. 2018, 70, 182–193. [Google Scholar] [CrossRef]
- Ryan, T.J.; Alarcon, G.M.; Walter, C.; Gamble, R.; Jessup, S.A.; Capiola, A.; Pfahler, M.D. Trust in Automated Software Repair. In Proceedings of the International Conference on Human-Computer Interaction, Orlando, FL, USA, 26–31 July 2019; Springer: Cham, Switzerland, 2019; pp. 452–470. [Google Scholar]
- Chaiken, S.; Maheswaran, D. Heuristic processing can bias systematic processing: Effects of source credibility, argument ambiguity, and task importance on attitude judgment. J. Personal. Soc. Psychol. 1994, 66, 460–473. [Google Scholar] [CrossRef]
- Madhavan, P.; Wiegmann, D.A. Similarities and differences between human–human and human–automation trust: An integrative review. Theor. Issues Ergon. Sci. 2007, 8, 277–301. [Google Scholar] [CrossRef]
- Dijkstra, J.J. User agreement with incorrect expert system advice. Behav. Inf. Technol. 1999, 18, 399–411. [Google Scholar] [CrossRef]
- Dzindolet, M.T.; Pierce, L.G.; Beck, H.P.; Dawe, L.A.; Anderson, B.W. Predicting misuse and disuse of combat identification systems. Mil. Psychol. 2001, 13, 147–164. [Google Scholar] [CrossRef]
- Tenny, T. Program readability: Procedures versus comments. IEEE Trans. Softw. Eng. 1988, 14, 1271–1279. [Google Scholar] [CrossRef]
- Aman, H. An Empirical Analysis of the Impact of Comment Statements on Fault-Proneness of Small-Size Module. In Proceedings of the 2012 19th Asia-Pacific Software Engineering Conference, Hong Kong, China, 4–7 December 2012; IEEE: Washington, DC, USA, 2012; pp. 362–367. [Google Scholar]
- Aman, H.; Amasaki, S.; Sasaki, T.; Kawahara, M. Empirical Analysis of Change-Proneness in Methods Having Local Variables with Long Names and Comments. In Proceedings of the 2015 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, Beijing, China, 22–23 October 2015; IEEE: Washington, DC, USA, 2015; pp. 1–4. [Google Scholar]
- Lyons, J.B.; Ho, N.T.; Koltai, K.S.; Masequesmay, G.; Skoog, M.; Cacanindin, A.; Johnson, W.W. Trust-based analysis of an Air Force collision avoidance system. Ergon. Des. 2016, 24, 9–12. [Google Scholar] [CrossRef]
- Aman, H.; Amasaki, S.; Yokogawa, T.; Kawahara, M. A Doc2Vec-Based Assessment of Comments and Its Application to Change-Prone Method Analysis. In Proceedings of the 2018 25th Asia-Pacific Software Engineering Conference, Nara, Japan, 4–7 December 2018; IEEE: Washington, DC, USA, 2018; pp. 643–647. [Google Scholar]
- De Vries, P.; Midden, C. Effect of indirect information on system trust and control allocation. Behav. Inf. Technol. 2008, 27, 17–29. [Google Scholar] [CrossRef]
- Le, D.X.B.; Bao, L.; Lo, D.; Xia, X.; Li, S.; Pasareanu, C. On Reliability of Patch Correctness Assessment. In Proceedings of the 2019 IEEE/ACM International Conference on Software Engineering, Montréal, QC, Canada, 25 May–1 June 2019; IEEE: Washington, DC, USA, 2019; pp. 524–535. [Google Scholar]
- Wang, S.; Wen, M.; Chen, L.; Yi, X.; Mao, X. How Different is it between Machine Generated and Developer Provided Patches? An Empirical Study on the Correct Patches Generated by Automated Program Repair Techniques. In Proceedings of the 2019 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement, Porto Galinhas, Brazil, 19–20 September 2019; ACM: New York, NY, USA, 2019; pp. 1–12. [Google Scholar]
- Parasuraman, R.; Sheridan, T.B.; Wickens, C.D. A model for types and levels of human interaction with automation. IEEE Trans. Syst. Man Cybern. Part. A Syst. Hum. 2000, 30, 286–297. [Google Scholar] [CrossRef] [Green Version]
- Chen, J.Y.; Barnes, M.J. Human–agent teaming for multirobot control: A review of human factors issues. IEEE Trans. Hum.-Mach. Syst. 2014, 44, 13–29. [Google Scholar] [CrossRef]
- Schaefer, K.E.; Chen, J.Y.; Szalma, J.L.; Hancock, P.A. A meta-analysis of factors influencing the development of trust in automation: Implications for understanding autonomy in future systems. Hum. Factors 2016, 58, 377–400. [Google Scholar] [CrossRef] [Green Version]
- Arkin, R.C.; Ulam, P.; Wagner, A.R. Moral decision making in autonomous systems: Enforcement, moral emotions, dignity, trust, and deception. Proc. IEEE 2012, 100, 571–589. [Google Scholar] [CrossRef]
- Mosier, K.L.; Skitka, L.J. Human Decision Makers and Automated Decision Aids. In Automation and Human Performance: Theory and Applications; Parasuraman, R., Mouloua, S., Eds.; Lawrence Erlbaum: Mahwah, NJ, USA, 1996; pp. 201–220. [Google Scholar]
- Lewandowsky, S.; Mundy, M.; Tan, G.P.A. The dynamics of trust: Comparing humans to automation. J. Exp. Psychol. Appl. 2000, 6, 104–123. [Google Scholar] [CrossRef] [PubMed]
- Smith, E.K.; Barr, E.T.; Le Goues, C.; Brun, Y. Is the Cure Worse Than the Disease? Overfitting in Automated Program Repair. In Proceedings of the 2015 Joint Meeting on Foundations in Software Engineering, Bergamo, Italy, 30 August–4 September 2015; ACM: New York, NY, USA; pp. 532–543. [Google Scholar]
- Nakajima, H.; Higo, Y.; Yokoyama, H.; Kusumoto, S. Toward Developer-Like Automated Program Repair—Modification Comparisons between GenProg and Developers. In Proceedings of the 2016 23rd Asia-Pacific Software Engineering Conference, Hamilton, New Zealand, 6–9 December 2016; IEEE: Washington, DC, USA, 2016; pp. 241–248. [Google Scholar]
- Waern, Y.; Ramberg, R. People’s perception of human and computer advice. Comput. Hum. Behav. 1996, 12, 17–27. [Google Scholar] [CrossRef]
- Jian, J.Y.; Bisantz, A.M.; Drury, C.G. Foundations for an empirically determined scale of trust in automated systems. Int. J. Cogn. Ergon. 2000, 4, 53–71. [Google Scholar] [CrossRef]
- Mayer, R.C.; Davis, J.H. The effect of the performance appraisal system on trust for management: A field quasi-experiment. J. Appl. Psychol. 1999, 84, 123–136. [Google Scholar] [CrossRef]
- Le Goues, C.; Holtschulte, N.; Smith, E.K.; Brun, Y.; Devanbu, P.; Forrest, S.; Weimer, W. The ManyBugs and IntroClass Benchmarks for Automated Repair of C Programs. IEEE Trans. Softw. Eng. 2015, 41, 1236–1256. [Google Scholar] [CrossRef]
- Martinez, M.; Durieux, T.; Sommerand, R.; Xuan, J.; Monperrus, M. Automatic repair of real bugs in java: A large-scale experiment on the defects4j dataset. Empir. Softw. Eng. 2018, 22, 1936–1964. [Google Scholar] [CrossRef] [Green Version]
- LASER-UMASSS/AutomatedRepairApplicabilityData. Available online: https://github.com/LASER-UMASS/AutomatedRepairApplicabilityData/blob/master/ManyBugs.csv (accessed on 27 February 2020).
- Pinheiro, J.; Bates, D.; DebRoy, S.; Sarkar, D.; R Core Team. Nlme: Linear and Nonlinear Mixed Effects Models. Available online: https://CRAN.R-project.org/package=nlme (accessed on 6 February 2019).
- R Core Team. R: A language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2018. [Google Scholar]
- Liu, S.; Rovine, M.J.; Molenaar, P. Selecting a linear mixed model for longitudinal data: Repeated measures analysis of variance, covariance pattern model, and growth curve approaches. Psychol. Methods 2012, 17, 15–30. [Google Scholar] [CrossRef]
- Singmann, H.; Bolker, B.; Westfall, J.; Aust, F.; Ben-Shachar, M.S. afex: Analysis of Factorial Experiments. Available online: https://CRAN.R-project.org/package=afex (accessed on 6 February 2019).
- Hervé, M. RVAideMemoire: Testing and Plotting Procedures for Biostatistics. Available online: https://CRAN.R-project.org/package=RVAideMemoire (accessed on 6 February 2019).
- Rusnock, C.F.; Miller, M.E.; Bindewald, J.M. Observations on Trust, Reliance, and Performance Measurement in Human-Automation Team Assessment. In Proceedings of the 2017 Industrial and Systems Engineering Conference, Pittsburgh, PA, USA, 20–23 May 2017; pp. 368–373. [Google Scholar]
- Riley, V. Operator Reliance on Automation: Theory and Data. In Automation and Human Performance: Theory and Applications; Parasuraman, M., Mouloua, M., Eds.; CRC Press: Boca Raton, FL, USA, 1997; pp. 19–35. [Google Scholar]
Variable | M | SD | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|---|---|
1. Trustworthy.T1 | 4.52 | 1.67 | (0.75) | |||||
2. Trustworthy.T2 | 4.60 | 1.50 | 0.58 ** | (0.71) | ||||
3. Trustworthy.T3 | 4.51 | 1.55 | 0.42 ** | 0.54 ** | (0.80) | |||
4. Trustworthy.T5 | 4.27 | 1.92 | 0.46 ** | 0.42 ** | 0.59 ** | (0.85) | ||
5. TrustIntention.T1 | 3.09 | 0.67 | 0.16 | −0.05 | 0.29 | 0.44 ** | (0.45) | |
6. TrustIntention.T5 | 2.58 | 0.88 | 0.49 ** | 0.46 ** | 0.51 ** | 0.66 ** | 0.51 ** | (0.76) |
Predictor | df | F |
---|---|---|
Time | 3, 131 | 0.51 |
Source | 1, 46 | 10.29 * |
Commented | 1, 46 | 1.09 |
Time × Source | 3, 131 | 1.54 |
Time × Commented | 3, 131 | 0.92 |
Source × Commented | 1, 46 | 0.43 |
Source × Commented × Time | 3, 131 | 0.38 |
Predictor | df | F | η2p |
---|---|---|---|
1. Time | 1, 42 | 17.24 * | 0.29 |
2. Source | 1, 42 | 11.62 * | 0.22 |
3. Commented | 1, 42 | 1.37 | 0.03 |
4. Time × Source | 1, 42 | 1.28 | 0.03 |
5. Time × Commented | 1, 42 | 0.14 | 0.00 |
6. Source × Commented | 1, 42 | 0.22 | 0.01 |
7. Source × Commented × Time | 1, 42 | 0.02 | 0.00 |
Predictor | df | Wald’s χ2 |
---|---|---|
Time | 3 | 2.69 |
Source | 1 | 9.02 * |
Commented | 1 | 1.00 |
Time × Source | 3 | 4.16 |
Time × Commented | 3 | 0.88 |
Source × Commented | 1 | 0.25 |
Source × Commented × Time | 3 | 0.11 |
Reputation | Transparency | Performance | Code | |||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Human | GenProg | Human | GenProg | Human | GenProg | Human | GenProg | |||||||||
Code | (+) | (−) | (+) | (−) | (+) | (−) | (+) | (−) | (+) | (−) | (+) | (−) | (+) | (−) | (+) | (−) |
1 | 0 | 3 | 2 | 1 | 3 | 7 | 0 | 13 | 2 | 6 | 6 | 10 | 0 | 0 | 0 | 0 |
2 | 0 | 0 | 0 | 0 | 3 | 12 | 6 | 14 | 5 | 13 | 8 | 10 | 0 | 2 | 0 | 1 |
3 | 4 | 2 | 0 | 1 | 4 | 17 | 3 | 14 | 8 | 6 | 6 | 16 | 0 | 0 | 0 | 0 |
4 | 0 | 3 | 0 | 2 | 7 | 6 | 1 | 15 | 5 | 7 | 5 | 20 | 1 | 0 | 0 | 0 |
5 | 0 | 5 | 0 | 5 | 6 | 11 | 0 | 25 | 14 | 10 | 3 | 34 | 1 | 1 | 1 | 1 |
Total | 4 | 13 | 2 | 9 | 23 | 53 | 10 | 81 | 34 | 42 | 28 | 90 | 2 | 3 | 1 | 2 |
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Alarcon, G.M.; Walter, C.; Gibson, A.M.; Gamble, R.F.; Capiola, A.; Jessup, S.A.; Ryan, T.J. Would You Fix This Code for Me? Effects of Repair Source and Commenting on Trust in Code Repair. Systems 2020, 8, 8. https://doi.org/10.3390/systems8010008
Alarcon GM, Walter C, Gibson AM, Gamble RF, Capiola A, Jessup SA, Ryan TJ. Would You Fix This Code for Me? Effects of Repair Source and Commenting on Trust in Code Repair. Systems. 2020; 8(1):8. https://doi.org/10.3390/systems8010008
Chicago/Turabian StyleAlarcon, Gene M., Charles Walter, Anthony M. Gibson, Rose F. Gamble, August Capiola, Sarah A. Jessup, and Tyler J. Ryan. 2020. "Would You Fix This Code for Me? Effects of Repair Source and Commenting on Trust in Code Repair" Systems 8, no. 1: 8. https://doi.org/10.3390/systems8010008