Journal Description
Software is an international, peer-reviewed, open access journal on all aspects of software engineering, published quarterly online by MDPI.
- Open Access: free for readers, with article processing charges (APC) paid by authors or their institutions.
- Rapid Publication: manuscripts are peer-reviewed and a first decision is provided to authors approximately 28.9 days after submission; accepted papers are published 4.2 days after acceptance (median values for papers published in this journal in the first half of 2025).
- Recognition of Reviewers: APC discount vouchers, optional signed peer review, and reviewer names published annually in the journal.
- Software is a companion journal of Electronics.
Latest Articles
Enabling Progressive Server-Side Rendering for Traditional Web Template Engines with Java Virtual Threads
Software 2025, 4(3), 20; https://doi.org/10.3390/software4030020 - 13 Aug 2025
Abstract
Modern web applications increasingly demand rendering techniques that optimize performance, responsiveness, and scalability. Progressive Server-Side Rendering (PSSR) bridges the gap between Server-Side Rendering and Client-Side Rendering by progressively streaming HTML content, improving perceived load times. Still, traditional HTML template engines often rely on blocking interfaces that hinder their use in asynchronous, non-blocking contexts required for PSSR. This paper analyzes how Java virtual threads, introduced in Java 21, enable non-blocking execution of blocking I/O operations, allowing the reuse of traditional template engines for PSSR without complex asynchronous programming models. We benchmark multiple engines across Spring WebFlux, Spring MVC, and Quarkus using reactive, suspendable, and virtual thread-based approaches. Results show that virtual threads allow blocking engines to scale comparably to those designed for non-blocking I/O, achieving high throughput and responsiveness under load. This demonstrates that virtual threads provide a compelling path to simplify the implementation of PSSR with familiar HTML templates, significantly lowering the barrier to entry while maintaining performance.
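The paper's experiments use Java virtual threads, but the core PSSR idea is language-agnostic: stream HTML fragments as each piece of (possibly blocking) I/O completes, instead of buffering the whole page. A minimal sketch in Python, with hypothetical names:

```python
import time
from typing import Iterable, Iterator

def render_progressively(items: Iterable[str]) -> Iterator[str]:
    """Yield HTML fragments as soon as each data item is available,
    instead of buffering the whole page (the essence of PSSR)."""
    yield "<html><body><ul>"
    for item in items:          # each item may come from blocking I/O
        yield f"<li>{item}</li>"
    yield "</ul></body></html>"

def slow_source() -> Iterator[str]:
    """Stand-in for a template engine's blocking data source."""
    for name in ["alpha", "beta", "gamma"]:
        time.sleep(0.01)        # simulates a blocking database call
        yield name

chunks = list(render_progressively(slow_source()))
page = "".join(chunks)
```

In the paper's setting, each such blocking step runs on a cheap virtual thread, so the server can hold many in-flight renders without an asynchronous programming model.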
Full article
(This article belongs to the Topic Software Engineering and Applications)
Open Access Article
Research and Development of Test Automation Maturity Model Building and Assessment Methods for E2E Testing
by
Daiju Kato, Ayane Mogi, Hiroshi Ishikawa and Yasufumi Takama
Software 2025, 4(3), 19; https://doi.org/10.3390/software4030019 - 5 Aug 2025
Abstract
Background: While several test-automation maturity models (e.g., CMMI, TMMi, TAIM) exist, none explicitly integrate ISO 9001-based quality management systems (QMS), leaving a gap for organizations that must align E2E test automation with formal quality assurance. Objective: This study proposes a test-automation maturity model (TAMM) that bridges E2E automation capability with ISO 9001/ISO 9004 self-assessment principles, and evaluates its reliability and practical impact in industry. Methods: TAMM comprises eight maturity dimensions, 39 requirements, and 429 checklist items. Three independent assessors applied the checklist to three software teams; inter-rater reliability was ensured via consensus review (Cohen’s κ = 0.75). Short-term remediation actions based on the checklist were implemented over six months and re-assessed. Synergy with the organization’s ISO 9001 QMS was analyzed using ISO 9004 self-check scores. Results: Within six months of remediation, the mean TAMM score rose from 2.75 to 2.85. Conclusions: The proposed TAMM delivers measurable, short-term maturity gains and complements ISO 9001-based QMS without introducing conflicting processes. Practitioners can use the checklist to identify actionable gaps, prioritize remediation, and quantify progress, while researchers may extend TAMM to other domains or automate scoring via repository mining.
Full article
(This article belongs to the Special Issue Software Reliability, Security and Quality Assurance)
Open Access Article
Intersectional Software Engineering as a Field
by
Alicia Julia Wilson Takaoka, Claudia Maria Cutrupi and Letizia Jaccheri
Software 2025, 4(3), 18; https://doi.org/10.3390/software4030018 - 30 Jul 2025
Abstract
Intersectionality is a concept used to explain the power dynamics and inequalities that some groups experience owing to the interconnection of social differences such as gender, sexual identity, poverty status, race, geographic location, disability, and education. The relation between software engineering, feminism, and intersectionality has been addressed by some studies thus far, but it has never been codified before. In this paper, we employ the commonly used ABC Framework for empirical software engineering to show the contributions of intersectional software engineering (ISE) as a field of software engineering. In addition, we highlight the power dynamics unique to ISE studies and define gender-forward intersectionality as a way to use gender as a starting point to identify and examine inequalities and discrimination. We show that ISE is a field of study in software engineering that uses gender-forward intersectionality to produce knowledge about power dynamics in software engineering in its specific domains and environments. Employing empirical software engineering research strategies, we explain the importance of recognizing and evaluating ISE through four dimensions of dynamics: people, processes, products, and policies. Beginning with a set of 10 seminal papers that enable us to define the initial concepts and the query, we conduct a systematic mapping study that yields a dataset of 140 primary papers, of which 15 are chosen as example papers. We apply the principles of ISE to these example papers to show how the field functions. Finally, we conclude the paper by advocating the recognition of ISE as a specialized field of study in software engineering.
Full article
(This article belongs to the Special Issue Women’s Special Issue Series: Software)
Open Access Article
Investigating Reproducibility Challenges in LLM Bugfixing on the HumanEvalFix Benchmark
by
Balázs Szalontai, Balázs Márton, Balázs Pintér and Tibor Gregorics
Software 2025, 4(3), 17; https://doi.org/10.3390/software4030017 - 14 Jul 2025
Abstract
Benchmark results for large language models often show inconsistencies across different studies. This paper investigates the challenges of reproducing these results in automatic bugfixing using LLMs, on the HumanEvalFix benchmark. To determine the cause of the differing results in the literature, we attempted to reproduce a subset of them by evaluating 12 models in the DeepSeekCoder, CodeGemma, CodeLlama, and WizardCoder model families, in different sizes and tunings. A total of 35 unique results were reported for these models across studies, of which we successfully reproduced 12. We identified several relevant factors that influenced the results. The base models can be confused with their instruction-tuned variants, making their results better than expected. Incorrect prompt templates or generation length can decrease benchmark performance, as well as using 4-bit quantization. Using sampling instead of greedy decoding can increase the variance, especially with higher temperature values. We found that precision and 8-bit quantization have less influence on benchmark results.
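One of the reproducibility factors named above, sampling temperature, can be illustrated directly: raising the temperature flattens the next-token distribution, so sampled outputs vary more between runs, while greedy decoding stays deterministic. A minimal sketch (hypothetical logits, not from the paper):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                 # hypothetical next-token scores
greedy = logits.index(max(logits))       # greedy decoding: always token 0
p_low  = softmax(logits, temperature=0.2)   # sharp: behaves almost greedily
p_high = softmax(logits, temperature=1.5)   # flat: sampling varies more
```

Because `p_high` spreads probability mass more evenly than `p_low`, repeated sampling at high temperature produces more run-to-run variance on a benchmark, which is one source of the differing published numbers.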
Full article
(This article belongs to the Topic Applications of NLP, AI, and ML in Software Engineering)
Open Access Editorial
New Editor-in-Chief of Software
by
Mirko Viroli
Software 2025, 4(3), 16; https://doi.org/10.3390/software4030016 - 10 Jul 2025
Abstract
I would like to introduce myself as the new Editor-in-Chief of Software [...]
Full article
Open Access Article
Analysing Concurrent Queues Using CSP: Examining Java’s ConcurrentLinkedQueue
by
Kevin Chalmers and Jan Bækgaard Pedersen
Software 2025, 4(3), 15; https://doi.org/10.3390/software4030015 - 7 Jul 2025
Abstract
In this paper we examine the OpenJDK library implementation of the ConcurrentLinkedQueue. We use model checking to verify that it behaves according to the algorithm it is based on: Michael and Scott’s fast and practical non-blocking concurrent queue algorithm. In addition, we develop a simple concurrent queue specification in CSP and verify that Michael and Scott’s algorithm satisfies it. We conclude that both the algorithm and the implementation are correct and both conform to our simpler concurrent queue specification, which we can use in place of either implementation in future verification tasks. The complete code is available on GitHub.
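The paper verifies the implementation against a simple sequential queue specification in CSP. A rough analogue of that idea, not the CSP model itself, is a trace checker that replays a history of operations against an idealized FIFO queue:

```python
from collections import deque

def conforms_to_fifo(trace):
    """Check a sequence of ('enq', v) / ('deq', v) events against the
    sequential FIFO queue specification: every dequeue must return the
    oldest element enqueued so far."""
    model = deque()
    for op, value in trace:
        if op == "enq":
            model.append(value)
        elif op == "deq":
            if not model or model.popleft() != value:
                return False            # observed behavior violates FIFO
    return True

ok  = conforms_to_fifo([("enq", 1), ("enq", 2), ("deq", 1), ("deq", 2)])
bad = conforms_to_fifo([("enq", 1), ("enq", 2), ("deq", 2)])  # out of order
```

Model checking a concurrent queue, as the paper does with CSP, amounts to showing every interleaving the implementation permits produces only traces this simple specification accepts.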
Full article
Open Access Review
Machine Learning Techniques for Requirements Engineering: A Comprehensive Literature Review
by
António Miguel Rosado da Cruz and Estrela Ferreira Cruz
Software 2025, 4(3), 14; https://doi.org/10.3390/software4030014 - 28 Jun 2025
Abstract
Software requirements engineering is one of the most critical and time-consuming phases of the software-development process. The lack of communication with stakeholders and the use of natural language for communicating lead to misunderstood, misidentified, or ambiguous requirements, which can jeopardize all subsequent steps in the software-development process and compromise the quality of the final software product. Natural Language Processing (NLP) is a long-established area of research that is currently being strongly and positively reshaped by recent advances in Machine Learning (ML), namely the emergence of Deep Learning and, more recently, the so-called transformer models such as BERT and GPT. Software requirements engineering is likewise strongly affected by this evolution of ML and other areas of Artificial Intelligence (AI). In this article we conduct a systematic review of how AI, ML and NLP are being used in the various stages of requirements engineering, including requirements elicitation, specification, classification, prioritization, management, and traceability. Furthermore, we identify which algorithms are most used in each of these stages, uncover challenges and open problems, and suggest future research directions.
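As a toy illustration of one task the review covers, requirements classification, here is a deliberately naive keyword-based classifier. The lexicon and rule are invented for illustration; the surveyed approaches use trained ML/NLP models such as BERT:

```python
# Hypothetical lexicon of non-functional-requirement cue words.
NFR_KEYWORDS = {"performance", "secure", "security", "available",
                "usability", "scalable", "response"}

def classify_requirement(text: str) -> str:
    """Toy functional vs. non-functional requirement classifier:
    flag the requirement as non-functional if any cue word appears."""
    words = {w.strip(".,").lower() for w in text.split()}
    return "non-functional" if words & NFR_KEYWORDS else "functional"

r1 = classify_requirement(
    "The system shall respond within 2 seconds to meet performance goals.")
r2 = classify_requirement(
    "The user shall be able to export reports as PDF.")
```

The gap between this brittle heuristic and the ambiguity of real stakeholder language is precisely why the field has moved to the deep learning and transformer approaches the review surveys.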
Full article
(This article belongs to the Topic Applications of NLP, AI, and ML in Software Engineering)
Open Access Article
Characterizing Agile Software Development: Insights from a Data-Driven Approach Using Large-Scale Public Repositories
by
Carlos Moreno Martínez, Jesús Gallego Carracedo and Jaime Sánchez Gallego
Software 2025, 4(2), 13; https://doi.org/10.3390/software4020013 - 24 May 2025
Abstract
This study investigates the prevalence and impact of Agile practices by leveraging metadata from thousands of public GitHub repositories through a novel data-driven methodology. To facilitate this analysis, we developed the AgileScore index, a metric designed to identify and evaluate patterns, characteristics, performance and community engagement in Agile-oriented projects. This approach enables comprehensive, large-scale comparisons between Agile methodologies and traditional development practices within digital environments. Our findings reveal a significant annual growth of 16% in the adoption of Agile practices and validate the AgileScore index as a systematic tool for assessing Agile methodologies across diverse development contexts. Furthermore, this study introduces innovative analytical tools for researchers in software project management, software engineering and related fields, providing a foundation for future work in areas such as cost estimation and hybrid project management. These insights contribute to a deeper understanding of Agile’s role in fostering collaboration and adaptability in dynamic digital ecosystems.
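The abstract does not give the AgileScore formula; purely to illustrate the shape of such an index, here is a hypothetical weighted sum over normalized repository signals (the feature names and weights are invented, not the paper's):

```python
# Hypothetical feature weights -- NOT the paper's actual AgileScore
# definition, only an illustration of a weighted repository-metadata index.
WEIGHTS = {
    "has_sprint_milestones": 0.30,
    "issue_label_coverage":  0.25,
    "ci_configured":         0.20,
    "release_cadence":       0.25,
}

def agile_score(repo_metadata: dict) -> float:
    """Weighted sum of normalized Agile signals; each input in [0, 1],
    so the score is also in [0, 1]."""
    return sum(WEIGHTS[k] * float(repo_metadata.get(k, 0.0)) for k in WEIGHTS)

score = agile_score({"has_sprint_milestones": 1, "issue_label_coverage": 0.8,
                     "ci_configured": 1, "release_cadence": 0.5})
```

Any real index of this kind also needs the weights justified and validated against labeled projects, which is the methodological contribution the study claims.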
Full article
Open Access Article
AI Testing for Intelligent Chatbots—A Case Study
by
Jerry Gao, Radhika Agarwal and Prerna Garsole
Software 2025, 4(2), 12; https://doi.org/10.3390/software4020012 - 15 May 2025
Cited by 1
Abstract
The decision tree test method works as a flowchart structure for conversational flow, with predetermined questions and answers that guide the user through specific tasks. Inspired by principles of the decision tree test method in software engineering, this paper discusses intelligent AI test modeling chat systems, including basic concepts, quality validation, test generation and augmentation, testing scopes, approaches, and needs. The paper’s novelty lies in an intelligent AI test modeling chatbot system built and implemented based on an innovative 3-dimensional AI test model for AI-powered functions in intelligent mobile apps, supporting model-based AI function testing, test data generation, and adequate test coverage result analysis. A case study is provided using Wysa, a mental health and emotional intelligence chatbot system that helps track and analyze mood and supports sentiment analysis.
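In decision-tree-based chatbot testing, each root-to-leaf path through the scripted flow is one test case, so full path coverage means exercising every leaf. A minimal sketch with a hypothetical miniature flow (not Wysa's actual script):

```python
def leaf_paths(tree, path=()):
    """Enumerate every root-to-leaf conversation path; each path is one
    test case for the chatbot's scripted decision-tree flow."""
    if isinstance(tree, str):              # leaf: a final bot response
        return [path + (tree,)]
    paths = []
    for label, subtree in tree.items():    # label: question or user answer
        paths.extend(leaf_paths(subtree, path + (label,)))
    return paths

# Hypothetical miniature mood-tracking flow for illustration only
flow = {
    "How do you feel?": {
        "good": "Great! Want to log this mood?",
        "bad": {
            "Want a breathing exercise?": {
                "yes": "Starting exercise.",
                "no": "Okay, I'm here if you need me.",
            }
        },
    }
}
cases = leaf_paths(flow)                   # three paths -> three test cases
```

A test suite built this way makes coverage measurable: the number of exercised paths over the number returned by `leaf_paths`.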
Full article
Open Access Article
Improving the Fast Fourier Transform for Space and Edge Computing Applications with an Efficient In-Place Method
by
Christoforos Vasilakis, Alexandros Tsagkaropoulos, Ioannis Koutoulas and Dionysios Reisis
Software 2025, 4(2), 11; https://doi.org/10.3390/software4020011 - 12 May 2025
Abstract
Satellite and edge computing designers develop algorithms that restrict resource utilization and execution time. Among these design efforts, optimizing the Fast Fourier Transform (FFT), key to many tasks, has led mainly to in-place FFT-specific hardware accelerators. Aiming at improving FFT performance on processors and computing devices with limited resources, the current paper enhances the efficiency of the radix-2 FFT by exploring the benefits of an in-place technique. First, we present the advantages of organizing the single memory bank of processors to store two (2) FFT elements in each memory address, providing parallel load and store of each FFT data pair. Second, we optimize the floating point (FP) and block floating point (BFP) configurations to improve the FFT Signal-to-Noise Ratio (SNR) performance and the resource utilization. The resulting techniques halve the memory requirements and significantly improve the execution time for the prevailing BFP representation. Executing inputs ranging from 1K to 16K FFT points, using 8-bit or 16-bit FP or BFP numbers, on the space-proven Atmel AVR32, the Vision Processing Unit (VPU) Intel Movidius Myriad 2, the edge device Raspberry Pi Zero 2W, and a low-cost accelerator on a Xilinx Zynq 7000 Field Programmable Gate Array (FPGA) validates the method’s performance improvement.
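For reference, the algorithm being optimized is the classic iterative in-place radix-2 FFT, which overwrites its input array and allocates no output buffer; the paper's memory-packing and BFP refinements build on this structure and are not reproduced here. A textbook sketch:

```python
import cmath

def fft_inplace(a):
    """Iterative in-place radix-2 decimation-in-time FFT.
    Length must be a power of two; the input list is overwritten."""
    n = len(a)
    # Bit-reversal permutation: reorder input for in-place butterflies
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Butterfly stages: combine pairs, then quads, up to the full length
    length = 2
    while length <= n:
        w_len = cmath.exp(-2j * cmath.pi / length)   # twiddle factor step
        for start in range(0, n, length):
            w = 1 + 0j
            for k in range(start, start + length // 2):
                u, v = a[k], a[k + length // 2] * w
                a[k], a[k + length // 2] = u + v, u - v
                w *= w_len
        length <<= 1
    return a

x = [1, 0, 0, 0]     # unit impulse: its FFT is all ones
fft_inplace(x)
```

Because every butterfly reads and writes the same two array slots, the working set is exactly the input array, which is what makes the paper's two-elements-per-address packing pay off.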
Full article
Open Access Article
Enhancing DevOps Practices in the IoT–Edge–Cloud Continuum: Architecture, Integration, and Software Orchestration Demonstrated in the COGNIFOG Framework
by
Kostas Petrakis, Evangelos Agorogiannis, Grigorios Antonopoulos, Themistoklis Anagnostopoulos, Nasos Grigoropoulos, Eleni Veroni, Alexandre Berne, Selma Azaiez, Zakaria Benomar, Harry Kakoulidis, Marios Prasinos, Philippos Sotiriades, Panagiotis Mavrothalassitis and Kosmas Alexopoulos
Software 2025, 4(2), 10; https://doi.org/10.3390/software4020010 - 15 Apr 2025
Cited by 1
Abstract
This paper presents COGNIFOG, an innovative framework under development that is designed to leverage decentralized decision-making, machine learning, and distributed computing to enable autonomous operation, adaptability, and scalability across the IoT–edge–cloud continuum. The work emphasizes Continuous Integration/Continuous Deployment (CI/CD) practices, development, and versatile integration infrastructures. The described methodology ensures efficient, reliable, and seamless integration of the framework, offering valuable insights into integration design, data flow, and the incorporation of cutting-edge technologies. Through three real-world trials in smart cities, e-health, and smart manufacturing and the development of a comprehensive QuickStart Guide for deployment, this work highlights the efficiency and adaptability of the COGNIFOG platform, presenting a robust solution for addressing the complexities of next-generation computing environments.
Full article
Open Access Article
Regression Testing in Agile—A Systematic Mapping Study
by
Suddhasvatta Das and Kevin Gary
Software 2025, 4(2), 9; https://doi.org/10.3390/software4020009 - 14 Apr 2025
Abstract
Background: Regression testing is critical in agile software development, as it ensures that frequent changes do not introduce defects into previously working functionalities. While agile methodologies emphasize rapid iterations and value delivery, regression testing research has predominantly focused on optimizing technical efficiency rather than aligning with agile principles. Aim: This study aims to systematically map research trends and gaps in regression testing within agile environments, identifying areas that require further exploration to enhance alignment with agile practices and value-driven outcomes. Method: A systematic mapping study analyzed 35 primary studies. The research categorized studies based on their focus areas, evaluation metrics, agile frameworks, and methodologies, providing a comprehensive overview of the field. Results: The findings strongly emphasize test prioritization and selection, reflecting the need for optimized fault detection and execution efficiency in agile workflows. However, areas such as test generation, test minimization, and cost analysis are under-explored. Current evaluation metrics primarily address technical outcomes, neglecting agile-specific aspects like defect severity’s business impact and iterative workflows. Additionally, the research highlights the dominance of continuous integration frameworks, with limited attention to other agile practices like Scrum and a lack of datasets capturing agile-specific attributes such as testing costs and user story importance. Conclusions: This study underscores the need for research to expand beyond existing focus areas, exploring diverse testing techniques and developing agile-centric metrics and datasets. By addressing these gaps, future work can enhance the applicability of regression testing strategies and align them more closely with agile development principles.
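Test prioritization, the dominant focus the mapping study identifies, can be sketched with a common baseline heuristic (not taken from any specific primary study): order tests by how many historical faults each has detected, so likely fault-revealing tests run first in a short agile iteration:

```python
def prioritize(tests, fault_history):
    """Greedy regression-test prioritization: order tests by the number
    of historical faults each one detected (a common baseline heuristic)."""
    return sorted(tests,
                  key=lambda t: len(fault_history.get(t, ())),
                  reverse=True)

# Hypothetical fault-detection history per test
history = {"t_login": {"F1", "F3"}, "t_search": {"F2"}, "t_export": set()}
order = prioritize(["t_export", "t_search", "t_login"], history)
```

The study's point is that metrics for judging such orderings (e.g., fault-detection rate) are technical; agile-centric variants would also weight defect severity's business impact and user-story importance.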
Full article
Open Access Article
Uplifting Moods: Augmented Reality-Based Gamified Mood Intervention App with Attention Bias Modification
by
Yun Jung Yeh, Sarah S. Jo and Youngjun Cho
Software 2025, 4(2), 8; https://doi.org/10.3390/software4020008 - 1 Apr 2025
Cited by 1
Abstract
Attention Bias Modification (ABM) is a cost-effective mood intervention that has the potential to be used in daily settings beyond clinical environments. However, its interactivity and user engagement are known to be limited and underexplored. Here, we propose Uplifting Moods, a novel mood intervention app that combines gamified ABM and augmented reality (AR) to address the limitation associated with the repetitive nature of ABM. By harnessing mobile AR’s low-cost, portable, and accessible characteristics, this approach helps users easily take part in ABM, positively shifting their emotions. We conducted a mixed methods study with 24 participants, involving a controlled experiment with the Self-Assessment Manikin as its primary measure and a semi-structured interview. Our analysis reports that the approach uniquely adds fun, exploration, and challenge, helping participants feel more engaged, more cheerful, and less controlled. It also highlights the importance of personalization and of considering gaming style, music preference, and socialization in designing a daily AR ABM game as an effective mental wellbeing intervention.
Full article
Open Access Article
Empirical Analysis of Data Sampling-Based Decision Forest Classifiers for Software Defect Prediction
by
Fatima Enehezei Usman-Hamza, Abdullateef Oluwagbemiga Balogun, Hussaini Mamman, Luiz Fernando Capretz, Shuib Basri, Rafiat Ajibade Oyekunle, Hammed Adeleye Mojeed and Abimbola Ganiyat Akintola
Software 2025, 4(2), 7; https://doi.org/10.3390/software4020007 - 21 Mar 2025
Abstract
The strategic significance of software testing in ensuring the success of software development projects is paramount. Comprehensive testing, conducted early and consistently across the development lifecycle, is vital for mitigating defects, especially given the constraints on time, budget, and other resources often faced by development teams. Software defect prediction (SDP) serves as a proactive approach to identifying software components that are most likely to be defective. By predicting these high-risk modules, teams can prioritize thorough testing and inspection, thereby preventing defects from escalating to later stages where resolution becomes more resource intensive. SDP models must be continuously refined to improve predictive accuracy and performance. This involves integrating clean and preprocessed datasets, leveraging advanced machine learning (ML) methods, and optimizing key metrics. Statistical-based and traditional ML approaches have been widely explored for SDP. However, statistical-based models often struggle with scalability and robustness, while conventional ML models face challenges with imbalanced datasets, limiting their prediction efficacy. In this study, innovative decision forest (DF) models were developed to address these limitations. Specifically, this study evaluates the cost-sensitive forest (CS-Forest), forest penalizing attributes (FPA), and functional trees (FT) as DF models. These models were further enhanced using homogeneous ensemble techniques, such as bagging and boosting. The experimental analysis on benchmark SDP datasets demonstrates that the proposed DF models effectively handle class imbalance, accurately distinguishing between defective and non-defective modules. Compared to baseline and state-of-the-art ML and deep learning (DL) methods, the suggested DF models exhibit superior prediction performance and offer scalable solutions for SDP. Consequently, the application of DF-based models is recommended for advancing defect prediction in software engineering and similar ML domains.
Full article
Open Access Review
Designing Microservices Using AI: A Systematic Literature Review
by
Daniel Narváez, Nicolas Battaglia, Alejandro Fernández and Gustavo Rossi
Software 2025, 4(1), 6; https://doi.org/10.3390/software4010006 - 19 Mar 2025
Cited by 2
Abstract
Microservices architecture has emerged as a dominant approach for developing scalable and modular software systems, driven by the need for agility and independent deployability. However, designing these architectures poses significant challenges, particularly in service decomposition, inter-service communication, and maintaining data consistency. To address these issues, artificial intelligence (AI) techniques, such as machine learning (ML) and natural language processing (NLP), have been applied with increasing frequency to automate and enhance the design process. This systematic literature review examines the application of AI in microservices design, focusing on AI-driven tools and methods for improving service decomposition, decision-making, and architectural validation. This review analyzes research studies published between 2018 and 2024 that specifically focus on the application of AI techniques in microservices design, identifying key AI methods used, challenges encountered in integrating AI into microservices, and the emerging trends in this research area. The findings reveal that AI has effectively been used to optimize performance, automate design tasks, and mitigate some of the complexities inherent in microservices architectures. However, gaps remain in areas such as distributed transactions and security. The study concludes that while AI offers promising solutions, further empirical research is needed to refine AI’s role in microservices design and address the remaining challenges.
Full article
Open Access Article
A Systematic Approach for Assessing Large Language Models’ Test Case Generation Capability
by
Hung-Fu Chang and Mohammad Shokrolah Shirazi
Software 2025, 4(1), 5; https://doi.org/10.3390/software4010005 - 10 Mar 2025
Abstract
Software testing ensures the quality and reliability of software products, but manual test case creation is labor-intensive. With the rise of Large Language Models (LLMs), there is growing interest in unit test creation with LLMs. However, effective assessment of LLM-generated test cases is limited by the lack of standardized benchmarks that comprehensively cover diverse programming scenarios. To assess LLMs’ test case generation ability in the absence of a suitable evaluation dataset, we propose the Generated Benchmark from Control-Flow Structure and Variable Usage Composition (GBCV) approach, which systematically generates programs for evaluating LLMs’ test generation capabilities. By leveraging basic control-flow structures and variable usage, GBCV provides a flexible framework to create a spectrum of programs ranging from simple to complex. Because GPT-4o and GPT-3.5-Turbo are publicly accessible and representative of typical real-world usage, we use GBCV to assess their performance. Our findings indicate that GPT-4o performs better on composite program structures, while all models effectively detect boundary values in simple conditions but face challenges with arithmetic computations. This study highlights the strengths and limitations of LLMs in test generation, provides a benchmark framework, and suggests directions for future improvement.
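The flavor of GBCV-style generation can be sketched as composing a control-flow template into a program under test, together with the boundary-value inputs a good generated test suite should cover. The template and helper names here are invented for illustration; the paper's actual generator is far richer:

```python
# Simplified, hypothetical sketch of template-based program generation;
# GBCV composes many control-flow structures and variable usages.
IF_TEMPLATE = (
    "def f(x):\n"
    "    if x {op} {bound}:\n"
    "        return 'high'\n"
    "    return 'low'\n"
)

def generate_program(op: str, bound: int) -> str:
    """Instantiate a single-branch program from the template."""
    return IF_TEMPLATE.format(op=op, bound=bound)

def boundary_inputs(bound: int):
    """Boundary-value inputs a generated test suite should exercise."""
    return [bound - 1, bound, bound + 1]

source = generate_program(">=", 10)
namespace = {}
exec(source, namespace)            # compile the generated program
f = namespace["f"]
results = [f(x) for x in boundary_inputs(10)]   # behavior at the boundary
```

Because the generator knows the ground-truth structure, it can check whether an LLM's generated tests actually hit the boundary cases, which is the evaluation lever the paper exploits.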
Full article
Open Access Article
On the Execution and Runtime Verification of UML Activity Diagrams
by
François Siewe and Guy Merlin Ngounou
Software 2025, 4(1), 4; https://doi.org/10.3390/software4010004 - 27 Feb 2025
Abstract
The unified modelling language (UML) is an industrial de facto standard for system modelling. It consists of a set of graphical notations (also known as diagrams) and has been used widely in many industrial applications. Although the graphical nature of UML is appealing to system developers, the official documentation of UML does not provide formal semantics for UML diagrams. This makes UML unsuitable for formal verification and, therefore, limited when it comes to the development of safety/security-critical systems where faults can cause damage to people, properties, or the environment. The UML activity diagram is an important UML graphical notation, which is effective in modelling the dynamic aspects of a system. This paper proposes a formal semantics for UML activity diagrams based on the calculus of context-aware ambients (CCA). An algorithm (semantic function) is proposed that maps any activity diagram onto a process in CCA, which describes the behaviours of the UML activity diagram. This process can then be executed and formally verified using the CCA simulation tool ccaPL and the CCA runtime verification tool ccaRV. Hence, design flaws can be detected and fixed early during the system development lifecycle. The pragmatics of the proposed approach are demonstrated using a case study in e-commerce.
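The paper maps activity diagrams onto CCA processes; as a much simpler analogue of "executing" a diagram, one can walk a graph of activity nodes from the initial to the final node and record the trace of actions. This sketch handles only single-successor nodes and is not the paper's CCA semantics:

```python
def execute_activity(diagram, initial="start", final="end"):
    """Walk a simplified activity diagram (one outgoing edge per node)
    from the initial node to the final node, recording the action trace."""
    trace, node = [], initial
    while node != final:
        trace.append(node)
        node = diagram[node]           # follow the single outgoing edge
    trace.append(final)
    return trace

# Hypothetical e-commerce checkout flow: each node maps to its successor
diagram = {"start": "add_to_cart", "add_to_cart": "pay", "pay": "end"}
trace = execute_activity(diagram)
```

A full semantics must also handle decisions, forks/joins, and concurrency, which is exactly why the paper targets a process calculus with tool support (ccaPL, ccaRV) rather than an ad hoc interpreter.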
Full article
(This article belongs to the Topic Software Engineering and Applications)
Open Access Article
The Scalable Detection and Resolution of Data Clumps Using a Modular Pipeline with ChatGPT
by
Nils Baumgartner, Padma Iyenghar, Timo Schoemaker and Elke Pulvermüller
Software 2025, 4(1), 3; https://doi.org/10.3390/software4010003 - 2 Feb 2025
Abstract
This paper explores a modular pipeline architecture that integrates ChatGPT, a Large Language Model (LLM), to automate the detection and refactoring of data clumps, a prevalent type of code smell that complicates software maintainability. Data clumps are groups of variables that repeatedly appear together and should ideally be refactored into a single abstraction to improve code quality. The pipeline leverages ChatGPT's ability to understand context and generate structured outputs, making it suitable for complex refactoring tasks. Through systematic experimentation, our study addresses the research questions outlined and demonstrates that the pipeline can accurately identify data clumps, excelling particularly in cases that require semantic understanding, where localized clumps are embedded within larger codebases. While the solution significantly enhances the refactoring workflow and facilitates the management of clumps distributed across multiple files, it also presents challenges such as occasional compiler errors and high computational costs. Feedback from developers underscores the usefulness of LLMs in software development but also highlights the essential role of human oversight in correcting inaccuracies. These findings demonstrate the pipeline's potential as a scalable and efficient solution for addressing code smells, contributing to the broader goal of improving software maintainability in large-scale projects.
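The data-clump smell and its standard refactoring can be sketched briefly. All names, the postcode rule, and the extracted class below are invented for illustration; they are not examples from the paper, which targets the same kind of transformation via its LLM pipeline.

```python
from dataclasses import dataclass

# Before: a data clump -- the same three parameters travel together
# through otherwise unrelated functions (names invented for illustration).
def format_address(street, city, postcode):
    return f"{street}, {postcode} {city}"

def shipping_zone(street, city, postcode):
    # Hypothetical rule: postcodes starting with "28" are domestic.
    return "domestic" if postcode.startswith("28") else "international"

# After: the clump is extracted into a single type, the kind of
# refactoring an LLM-based pipeline would be asked to apply.
@dataclass(frozen=True)
class Address:
    street: str
    city: str
    postcode: str

def format_address_refactored(addr: Address) -> str:
    return f"{addr.street}, {addr.postcode} {addr.city}"
```

When such a clump is spread across many files, every call site must change consistently, which is why the abstract stresses both the value of automation and the need for human oversight.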
Full article
(This article belongs to the Topic Applications of NLP, AI, and ML in Software Engineering)
Open Access Article
German Translation and Psychometric Analysis of the SOLID-SD: A German Inventory for Assessing Security Culture in Software Companies
by
Christina Glasauer, Hollie N. Pearl and Rainer W. Alexandrowicz
Software 2025, 4(1), 2; https://doi.org/10.3390/software4010002 - 24 Jan 2025
Abstract
The SOLID-S is an inventory assessing six dimensions of organizational (software) security culture, currently available in English. Here, we present the German version, SOLID-SD, along with its translation process and psychometric analysis. Using a partial credit model on a sample of N = 280 persons, we found overall highly satisfactory measurement properties for the instrument: no threshold permutations, no serious differential item functioning, and good item fits. The subscales' internal consistencies and the inter-scale correlations show very high similarity between the SOLID-SD and the original English version, indicating a successful translation of the instrument.
Full article
(This article belongs to the Special Issue Software Reliability, Security and Quality Assurance)
Open Access Article
A Common Language of Software Evolution in Repositories (CLOSER)
by
Jordan Garrity and David Cutting
Software 2025, 4(1), 1; https://doi.org/10.3390/software4010001 - 6 Jan 2025
Abstract
Version Control Systems (VCSs) are used by development teams to manage the collaborative evolution of source code, and several industry-standard VCSs are in wide use. In addition to the code files themselves, a VCS records metadata about each change, and these data are often fed to analytical tools to provide insight into software development, a process known as Mining Software Repositories (MSR). MSR tools are numerous but usually limited to one VCS format, which restricts their scope of application and adds the initial effort of implementing parsers for verbose textual VCS output. To address this limitation, a domain-specific language (DSL), the Common Language of Software Evolution in Repositories (CLOSER), was defined that abstracts away from specific implementations while mapping isomorphically to the data models of all major VCS formats. Used directly as a data model, or as an intermediate stage in a conversion pipeline, CLOSER makes all major repository formats available for analysis rather than a single one. The barrier to adopting MSR approaches is also lowered because CLOSER output is a concise, easily machine-readable format. CLOSER was implemented in tooling and tested against a number of common expected use cases, including direct use in MSR analysis, demonstrating the fidelity of the model and its implementation. CLOSER was also used successfully to convert raw output logs from one VCS format to another, suggesting that legacy analysis tools could be applied to other technologies without modification. In addition to opening all major VCS formats to analysis through a generic model, the CLOSER format was found to require less parsing code and to parse faster than traditional VCS logging output.
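The core idea of mapping a VCS-specific log into a VCS-agnostic record can be sketched as follows. The record fields, the regular expression, and the input format are invented for illustration; CLOSER's actual grammar and data model are defined in the paper.

```python
import re

# Hypothetical sketch of the CLOSER idea: parse one VCS-specific log line
# (here, a git one-line log entry) into a generic, VCS-agnostic record.
# The field names "revision" and "summary" are invented for illustration.
GIT_ONELINE = re.compile(r"^(?P<commit>[0-9a-f]{7,40}) (?P<message>.*)$")

def to_generic_record(line: str) -> dict:
    m = GIT_ONELINE.match(line)
    if m is None:
        raise ValueError(f"unrecognised log line: {line!r}")
    return {"revision": m.group("commit"), "summary": m.group("message")}
```

An analysis tool written against the generic record never sees the git-specific format, so a parser for another VCS can feed the same tool, which is the single-format limitation the abstract addresses.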
Full article

Highly Accessed Articles
Latest Books
E-Mail Alert
News
Topics
Topic in
Algorithms, Applied Sciences, Electronics, MAKE, AI, Software
Applications of NLP, AI, and ML in Software Engineering
Topic Editors: Affan Yasin, Javed Ali Khan, Lijie Wen
Deadline: 31 August 2025
Topic in
Applied Sciences, Electronics, Informatics, Information, Software
Software Engineering and Applications
Topic Editors: Sanjay Misra, Robertas Damaševičius, Bharti Suri
Deadline: 31 October 2025
Topic in
Applied Sciences, ASI, Blockchains, Computers, MAKE, Software
Recent Advances in AI-Enhanced Software Engineering and Web Services
Topic Editors: Hai Wang, Zhe Hou
Deadline: 31 May 2026

Conferences
Special Issues
Special Issue in
Software
Software Reliability, Security and Quality Assurance
Guest Editors: Tadashi Dohi, Junjun Zheng, Xiao-Yi Zhang
Deadline: 25 December 2025
Special Issue in
Software
Women’s Special Issue Series: Software
Guest Editors: Tingting Bi, Xing Hu, Letizia Jaccheri
Deadline: 31 December 2025