Software, Volume 4, Issue 1 (March 2025) – 6 articles

Cover Story: High software costs are often driven by maintenance. Code smells hint at design issues, and refactoring them can help to reduce long-term costs. One specific type of code smell, known as data clumps, refers to recurring, tightly connected groups of variables that may appear across entire projects. Automating their detection and resolution—separating them from their unfortunate entanglement—requires both structural and semantic understanding, including the appropriate naming and placement of components. We present a pipeline combining deterministic analysis with the semantic capabilities of Large Language Models. In a supervised experiment with real-world projects, developers confirmed that functionality was preserved and that naming and placement were suitable, while also identifying areas for refinement in overall code quality.
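
As a minimal illustration of the smell itself (our own Python sketch, not code from the paper; all names are hypothetical), the parameter group below recurs across unrelated functions and would be extracted into a single, well-named type:

    from dataclasses import dataclass
    from datetime import date

    # Before: start_date, end_date, and time_zone always travel together -- a data clump.
    def schedule_meeting(start_date: date, end_date: date, time_zone: str) -> None: ...
    def generate_report(start_date: date, end_date: date, time_zone: str) -> None: ...

    # After: the clump is extracted into one named, sensibly placed type.
    @dataclass
    class DateRange:
        start_date: date
        end_date: date
        time_zone: str

    def schedule_meeting_v2(period: DateRange) -> None: ...
    def generate_report_v2(period: DateRange) -> None: ...
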
18 pages, 546 KiB  
Review
Designing Microservices Using AI: A Systematic Literature Review
by Daniel Narváez, Nicolas Battaglia, Alejandro Fernández and Gustavo Rossi
Software 2025, 4(1), 6; https://doi.org/10.3390/software4010006 - 19 Mar 2025
Viewed by 1606
Abstract
Microservices architecture has emerged as a dominant approach for developing scalable and modular software systems, driven by the need for agility and independent deployability. However, designing these architectures poses significant challenges, particularly in service decomposition, inter-service communication, and maintaining data consistency. To address these issues, artificial intelligence (AI) techniques, such as machine learning (ML) and natural language processing (NLP), have been applied with increasing frequency to automate and enhance the design process. This systematic literature review examines the application of AI in microservices design, focusing on AI-driven tools and methods for improving service decomposition, decision-making, and architectural validation. This review analyzes research studies published between 2018 and 2024 that specifically focus on the application of AI techniques in microservices design, identifying key AI methods used, challenges encountered in integrating AI into microservices, and the emerging trends in this research area. The findings reveal that AI has effectively been used to optimize performance, automate design tasks, and mitigate some of the complexities inherent in microservices architectures. However, gaps remain in areas such as distributed transactions and security. The study concludes that while AI offers promising solutions, further empirical research is needed to refine AI’s role in microservices design and address the remaining challenges.
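
One family of techniques that recurs in this literature is service decomposition by clustering a class-coupling graph, so that densely connected classes become candidate services. A minimal Python sketch follows; the class names, edge weights, and the choice of modularity clustering are our illustrative assumptions, not a method prescribed by the review:

    import networkx as nx
    from networkx.algorithms.community import greedy_modularity_communities

    G = nx.Graph()
    # Edge weight = number of static references between two classes (hypothetical data).
    G.add_weighted_edges_from([
        ("Order", "OrderItem", 12), ("Order", "Payment", 3),
        ("Payment", "Invoice", 9), ("Customer", "Order", 5),
        ("Customer", "Address", 8), ("Invoice", "Address", 1),
    ])

    # Each modularity community becomes a candidate microservice boundary.
    for i, service in enumerate(greedy_modularity_communities(G, weight="weight")):
        print(f"candidate service {i}: {sorted(service)}")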

20 pages, 4758 KiB  
Article
A Systematic Approach for Assessing Large Language Models’ Test Case Generation Capability
by Hung-Fu Chang and Mohammad Shokrolah Shirazi
Software 2025, 4(1), 5; https://doi.org/10.3390/software4010005 - 10 Mar 2025
Viewed by 716
Abstract
Software testing ensures the quality and reliability of software products, but manual test case creation is labor-intensive. With the rise of Large Language Models (LLMs), there is growing interest in unit test creation with LLMs. However, effective assessment of LLM-generated test cases is limited by the lack of standardized benchmarks that comprehensively cover diverse programming scenarios. To assess LLMs’ test case generation ability in the absence of a suitable evaluation dataset, we propose the Generated Benchmark from Control-Flow Structure and Variable Usage Composition (GBCV) approach, which systematically generates programs used for evaluating LLMs’ test generation capabilities. By leveraging basic control-flow structures and variable usage, GBCV provides a flexible framework to create a spectrum of programs ranging from simple to complex. Because GPT-4o and GPT-3.5-Turbo are publicly accessible and representative of what typical users can reach, we use GBCV to assess their test generation performance. Our findings indicate that GPT-4o performs better on composite program structures, while all models effectively detect boundary values in simple conditions but face challenges with arithmetic computations. This study highlights the strengths and limitations of LLMs in test generation, provides a benchmark framework, and suggests directions for future improvement.
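
The core idea, composing benchmark programs from basic control-flow structures and variable usage, can be sketched as follows; the templates and the composition scheme are our guess at the flavor of GBCV, not the authors’ implementation:

    # Hypothetical control-flow templates; GBCV composes such pieces into programs
    # of graded complexity, which are then handed to an LLM to generate unit tests.
    IF_TEMPLATE = (
        "def f{idx}(x: int) -> int:\n"
        "    if x > {bound}:\n"
        "        return x - {bound}\n"
        "    return x + {bound}\n"
    )
    LOOP_TEMPLATE = (
        "def f{idx}(x: int) -> int:\n"
        "    total = 0\n"
        "    while x > 0:\n"
        "        total += x\n"
        "        x -= {step}\n"
        "    return total\n"
    )

    def generate_benchmark(specs) -> str:
        """Render one benchmark source file from (template, parameters) pairs."""
        return "\n".join(tpl.format(idx=i, **params)
                         for i, (tpl, params) in enumerate(specs))

    print(generate_benchmark([(IF_TEMPLATE, {"bound": 10}),
                              (LOOP_TEMPLATE, {"step": 2})]))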

42 pages, 845 KiB  
Article
On the Execution and Runtime Verification of UML Activity Diagrams
by François Siewe and Guy Merlin Ngounou
Software 2025, 4(1), 4; https://doi.org/10.3390/software4010004 - 27 Feb 2025
Viewed by 475
Abstract
The unified modelling language (UML) is an industrial de facto standard for system modelling. It consists of a set of graphical notations (also known as diagrams) and has been used widely in many industrial applications. Although the graphical nature of UML is appealing to system developers, the official documentation of UML does not provide formal semantics for its diagrams. This makes UML unsuitable for formal verification and therefore of limited use in the development of safety- and security-critical systems, where faults can cause damage to people, property, or the environment. The UML activity diagram is an important UML graphical notation, which is effective in modelling the dynamic aspects of a system. This paper proposes a formal semantics for UML activity diagrams based on the calculus of context-aware ambients (CCA). An algorithm (semantic function) is proposed that maps any activity diagram onto a process in CCA, which describes the behaviours of the activity diagram. This process can then be executed and formally verified using the CCA simulation tool ccaPL and the CCA runtime verification tool ccaRV. Hence, design flaws can be detected and fixed early in the system development lifecycle. The pragmatics of the proposed approach are demonstrated using a case study in e-commerce.
(This article belongs to the Topic Software Engineering and Applications)
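
The semantic function at the heart of the approach can be pictured as a recursive translation from activity-diagram nodes to process terms. The Python sketch below is schematic only: the node types are heavily simplified and the emitted syntax is a placeholder, not actual CCA/ccaPL notation:

    from dataclasses import dataclass

    @dataclass
    class Action:          # an action node, e.g. "receiveOrder"
        name: str

    @dataclass
    class Sequence:        # a control-flow edge: first, then second
        first: object
        second: object

    @dataclass
    class Decision:        # a decision node with a guard and two branches
        guard: str
        then_branch: object
        else_branch: object

    def translate(node) -> str:
        """Schematic semantic function: activity node -> process term."""
        match node:
            case Action(name):
                return f"{name}.0"
            case Sequence(first, second):
                return f"({translate(first)} ; {translate(second)})"
            case Decision(guard, then_branch, else_branch):
                return f"if {guard} then {translate(then_branch)} else {translate(else_branch)}"

    diagram = Sequence(Action("receiveOrder"),
                       Decision("inStock", Action("ship"), Action("refund")))
    print(translate(diagram))  # the real pipeline would execute/verify this via ccaPL/ccaRV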

23 pages, 602 KiB  
Article
The Scalable Detection and Resolution of Data Clumps Using a Modular Pipeline with ChatGPT
by Nils Baumgartner, Padma Iyenghar, Timo Schoemaker and Elke Pulvermüller
Software 2025, 4(1), 3; https://doi.org/10.3390/software4010003 - 2 Feb 2025
Viewed by 892
Abstract
This paper explores a modular pipeline architecture that integrates ChatGPT, a Large Language Model (LLM), to automate the detection and refactoring of data clumps—a prevalent type of code smell that complicates software maintainability. Data clumps are groups of variables that recur together throughout a codebase and should ideally be refactored to improve code quality. The pipeline leverages ChatGPT’s capabilities to understand context and generate structured outputs, making it suitable for addressing complex software refactoring tasks. Through systematic experimentation, our study not only addresses the research questions outlined but also demonstrates that the pipeline can accurately identify data clumps, particularly excelling in cases that require semantic understanding—where localized clumps are embedded within larger codebases. While the solution significantly enhances the refactoring workflow, facilitating the management of distributed clumps across multiple files, it also presents challenges such as occasional compiler errors and high computational costs. Feedback from developers underscores the usefulness of LLMs in software development but also highlights the essential role of human oversight in correcting inaccuracies. These findings demonstrate the pipeline’s potential as a scalable and efficient solution for addressing code smells, contributing to the broader goal of improving software maintainability in large-scale, real-world projects.
(This article belongs to the Topic Applications of NLP, AI, and ML in Software Engineering)
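
The deterministic half of such a pipeline can be pictured as a signature scan that flags parameter groups recurring across functions. In the Python sketch below, the thresholds (groups of at least three variables occurring at least twice, a common informal rule of thumb for data clumps) and the AST-based matching are our assumptions, not the paper’s exact rules:

    import ast
    from collections import Counter
    from itertools import combinations

    SOURCE = """
    def draw(x, y, z, color): ...
    def move(x, y, z, speed): ...
    def log(msg): ...
    """

    def find_data_clumps(source: str, min_size: int = 3, min_occurrences: int = 2):
        groups = Counter()
        for node in ast.walk(ast.parse(source)):
            if isinstance(node, ast.FunctionDef):
                names = sorted(a.arg for a in node.args.args)
                for size in range(min_size, len(names) + 1):
                    for combo in combinations(names, size):
                        groups[combo] += 1
        return [clump for clump, n in groups.items() if n >= min_occurrences]

    # Each reported clump would then go to the LLM stage for a suggested class
    # name, a placement in the codebase, and the refactoring edits themselves.
    print(find_data_clumps(SOURCE.replace("\n    ", "\n")))  # [('x', 'y', 'z')]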

16 pages, 650 KiB  
Article
German Translation and Psychometric Analysis of the SOLID-SD: A German Inventory for Assessing Security Culture in Software Companies
by Christina Glasauer, Hollie N. Pearl and Rainer W. Alexandrowicz
Software 2025, 4(1), 2; https://doi.org/10.3390/software4010002 - 24 Jan 2025
Viewed by 508
Abstract
The SOLID-S is an inventory assessing six dimensions of organizational (software) security culture, which is currently available in English. Here, we present the German version, SOLID-SD, along with its translation process and psychometric analysis. With a partial credit model based on a sample of N = 280 persons, we found, overall, highly satisfactory measurement properties for the instrument. There were no threshold permutations, no serious differential item functioning, and good item fits. The subscales’ internal consistencies and the inter-scale correlations show very high similarities between the SOLID-SD and the original English version, indicating a successful translation of the instrument.
(This article belongs to the Special Issue Software Reliability, Security and Quality Assurance)
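
For reference, the partial credit model underlying the analysis is, in its standard Masters (1982) form (notation ours, not the authors’), the probability that person v responds in category x of item i:

    P(X_{vi} = x) = \frac{\exp \sum_{j=1}^{x} (\theta_v - \delta_{ij})}
                         {\sum_{k=0}^{m_i} \exp \sum_{j=1}^{k} (\theta_v - \delta_{ij})},
    \qquad \text{with } \sum_{j=1}^{0} (\theta_v - \delta_{ij}) \equiv 0,

where θ_v is the person parameter, δ_ij is the j-th threshold of item i, and m_i is the highest response category of item i. The reported absence of threshold permutations means the estimated δ_ij are ordered, i.e., higher response categories consistently reflect higher trait levels.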

25 pages, 4974 KiB  
Article
A Common Language of Software Evolution in Repositories (CLOSER)
by Jordan Garrity and David Cutting
Software 2025, 4(1), 1; https://doi.org/10.3390/software4010001 - 6 Jan 2025
Viewed by 723
Abstract
Version Control Systems (VCSs) are used by development teams to manage the collaborative evolution of source code, and there are several widely used industry-standard VCSs. In addition to the code files themselves, metadata about the changes made are also recorded by the VCS, and these are often fed into analytical tools to provide insight into the software development process, a practice known as Mining Software Repositories (MSR). MSR tools are numerous but most often limited to one VCS format and therefore restricted in their scope of application, in addition to the initial effort required to implement parsers for verbose textual VCS output. To address this limitation, a domain-specific language (DSL), the Common Language of Software Evolution in Repositories (CLOSER), was defined that abstracts away from specific implementations while mapping isomorphically onto the data models of all major VCS formats. CLOSER can be used directly as a data model, or as an intermediate stage in a conversion approach, so that analyses can draw on all major repositories rather than being limited to a single format. The initial barrier to adoption for MSR approaches is also lowered, as CLOSER output is a concise, easily machine-readable format. CLOSER was implemented in tooling and tested against a number of common expected use cases, including direct use in MSR analysis, demonstrating the fidelity of the model and implementation. CLOSER was also successfully used to convert raw output logs from one VCS format to another, offering the possibility that legacy analysis tools could be used on other technologies without any changes being required. In addition to the advantage that a generic model opens all major VCS formats to analysis, the CLOSER format was found to require less code and to parse faster than traditional VCS logging outputs.
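
The conversion idea can be sketched as a reader that normalizes VCS-specific log output into a single common change model. In the Python sketch below, the record layout and the git pretty-format string are our own illustrative choices; CLOSER’s actual grammar is the one defined in the paper:

    import subprocess
    from dataclasses import dataclass

    @dataclass
    class Change:          # one commit/revision in the common model
        revision: str
        author: str
        timestamp: str
        message: str

    def read_git_history(repo_path: str) -> list[Change]:
        # The unit separator (0x1f) keeps field parsing deterministic.
        out = subprocess.run(
            ["git", "-C", repo_path, "log", "--pretty=format:%H\x1f%an\x1f%aI\x1f%s"],
            capture_output=True, text=True, check=True,
        ).stdout
        return [Change(*line.split("\x1f")) for line in out.splitlines() if line]

    # A Subversion or Mercurial reader would emit the same Change records, so a
    # single analysis tool can consume any of the major VCS formats.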
