Context: Modern software systems have become increasingly complex, making it difficult to interpret raw requirements and effectively utilize traditional tools for software design and analysis. Unified Modeling Language (UML) class diagrams are widely used to visualize and understand system architecture, but analyzing them manually, especially for large-scale systems, poses significant challenges.
Objectives: This study aims to automate the analysis of UML class diagrams by assessing their complexity using a machine learning approach. The goal is to support software developers in identifying potential design issues early in the development process and to improve overall software quality.
Methodology: To achieve this, this research introduces a Visual Question Answering (VQA)-based framework that integrates both computer vision and natural language processing. Vision Transformers (ViTs) are employed to extract global visual features from UML class diagrams, while the BERT language model processes natural language queries. By combining these two models, the system can accurately respond to questions related to software complexity, such as class coupling and inheritance depth.
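The abstract does not state how the visual and textual features are combined, so the following is only a minimal sketch of one common option, late fusion by concatenation followed by a linear answer classifier. The feature vectors stand in for ViT and BERT outputs; the dimensions, random weights, and the function name `fuse_and_classify` are hypothetical and chosen purely for illustration.

```python
import math
import random


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(image_features, text_features, weights, bias):
    """Concatenate both feature vectors and score each candidate answer.

    ASSUMPTION: concatenation plus one linear layer is an illustrative
    fusion choice, not the fusion strategy confirmed by the study.
    """
    fused = list(image_features) + list(text_features)
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


# Hypothetical sizes; real ViT/BERT embeddings are far larger.
IMAGE_DIM, TEXT_DIM, NUM_ANSWERS = 4, 3, 2

random.seed(0)
image_features = [random.uniform(-1, 1) for _ in range(IMAGE_DIM)]  # stand-in for ViT output
text_features = [random.uniform(-1, 1) for _ in range(TEXT_DIM)]    # stand-in for BERT output
weights = [
    [random.uniform(-1, 1) for _ in range(IMAGE_DIM + TEXT_DIM)]
    for _ in range(NUM_ANSWERS)
]
bias = [0.0] * NUM_ANSWERS

probs = fuse_and_classify(image_features, text_features, weights, bias)
```

In a trained system the weights would be learned from question-answer pairs, and each output probability would correspond to one candidate answer, for example a complexity level for the diagram.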
Results: The proposed method demonstrated strong performance in experimental trials. The ViT model achieved an accuracy of 0.8800, with both the F1 score and recall reaching 0.8985. These metrics highlight the effectiveness of the approach in automatically evaluating UML class diagrams.
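For readers who want to see how accuracy, recall, and F1 relate to one another, the sketch below computes them from binary confusion-matrix counts. The counts are hypothetical and are not taken from the study; the abstract also does not say whether its scores are binary or averaged across classes, so this covers only the simplest binary case.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from raw counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the paper's data).
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

In the binary case, F1 can equal recall only when precision equals recall as well, since the harmonic mean of two values matches one of them only if both are the same.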
Conclusions: The findings confirm that advanced machine learning techniques can be successfully applied to automate software design analysis. This approach can help developers detect design flaws early and enhance software maintainability. Future work will explore advanced fusion strategies, novel data augmentation techniques, and lightweight model adaptations suitable for environments with limited computational resources.