1. Introduction
A frequently discussed issue in software evolution is the tradeoff between short-term and long-term goals. Very often, the need for fast delivery to stakeholders leads to design problems and rework costs. Specifically, design shortcuts taken to deliver quickly might delay future development, compromising software maintainability and evolution.
Numerous studies in the literature address design problems and their impact on the evolution of software projects, highlighting that these problems increase the cost of change over time until the software becomes almost unmaintainable. Design smells have been defined as indicators of such problems. In particular, the presence of design smells can indicate the use of constructs that are harmful to system maintenance activities. Architectural smells are ultimately instances of poor design decisions [
1,
2], at the architectural level.
Design and architectural smells negatively affect system life cycle properties, such as understandability, testability, extensibility, and re-usability [
3]. Indeed, design smells are closely related to system evolution, as often an organization necessarily deals with accumulated design problems when adding new features to a software system. Unmanaged, design smells can lead to significant technical problems and increased maintenance and evolution efforts.
More specifically, during the evolution of the software, the quantity and complexity of the interactions between the software elements increase, with a consequent effect on the project structure.
This paper presents an empirical study conducted on the evolution history of eight software systems, analyzing 17,252 commits.
In particular, focusing on dependency-related smells, the study investigates the evolution of design smells by detecting instances of multiple types of design smells.
Among all the smells defined in the literature [
4], the focus of this article is on abstraction design smells, encapsulation design smells, modularization design smells, and hierarchy design smells.
The impact of these smells on the number of refactorings and on the related modifications carried out on a software system was studied.
The identification of refactoring activities was used to assess whether refactoring improves the design of existing source code.
The empirical study confirmed that classes affected by design smells are more subject to changes, especially when multiple smells are detected in the same class.
This result may be due to the fact that most smells are introduced when the affected class is committed to the repository for the first time. In addition, it emerged that the removal of design smells occurs simultaneously for multiple smells.
Finally, the results showed that the removal of smells is not related to the presence of refactoring.
The rest of the paper is organized as follows:
Section 2 provides background material on software design, open source software development, and mining software repositories;
Section 3 discusses related work in the literature;
Section 4 describes the design of the empirical study;
Section 5 reports the results of the study;
Section 6 discusses the threats that could affect the obtained results; finally, conclusions are given in the last section.
2. Background
2.1. Software Design
Numerous studies have shown that software design is a critical concern that can strongly influence the quality of the software. The presence of design problems and incorrect constructs can negatively affect the main qualities of the software, such as its comprehensibility, testability, extensibility, reusability, and maintainability [
5].
Smells are a very useful indicator of design problems [
6].
In the literature, smells are divided into three main categories according to the extent of their impact: code smells, architectural smells, and design smells.
Architectural smells and code smells are indicators of bad code or design that can lead to quality problems, such as breakdowns, technical debt, or difficulties in maintenance and evolution.
Code smells usually have a limited, local impact because they involve a class or a file and require only simple refactoring activities. Architectural smells, on the other hand, concern the system level and represent violations of design principles or of decisions that affect the internal qualities of the software, with negative effects on maintenance and evolution costs.
Design smells indicate design aspects that violate fundamental principles and negatively affect software design. Moreover, not only knowing about them but also managing them is fundamental: because they are related to the evolution of the system, incorrectly managed design smells lead to significant technical problems and increase the maintenance and evolution effort required of developers.
In particular, this study focuses on design smells, categorized into four groups on the basis of their similarity: abstraction, encapsulation, modularization, and hierarchy.
The focus is on design smells because the literature contains numerous studies on other types of smells, whereas some issues related to design smells remain unsolved.
More specifically, this study considered the following abstraction smells: imperative abstraction arises when an operation is turned into a class; unnecessary abstraction arises when an abstraction that is not needed is introduced into the software design; unutilized abstraction arises when an abstraction is left unused; finally, multifaceted abstraction arises when the elements of an abstraction are not cohesive.
A deficient encapsulation smell occurs when the declared accessibility of one or more members of an abstraction is more permissive than required, and an unexploited encapsulation smell occurs when the client class relies on explicit type checks instead of taking advantage of the variation in types already encapsulated in a hierarchy.
Broken modularization refers to data and/or methods that are scattered across multiple abstractions when they should have been localized in a single abstraction; insufficient modularization refers to a large or complex abstraction that could be further modularized; the hub-like modularization smell occurs when an abstraction has dependencies with a large number of other abstractions; and cyclic-dependent modularization occurs when two or more abstractions depend on each other directly or indirectly, creating tight coupling.
Wide hierarchy occurs when an inheritance hierarchy is too wide, indicating that intermediate types may be missing; broken hierarchy occurs when a supertype and its subtype do not conceptually share an “IS-A” relationship, resulting in broken substitutability; deep hierarchy arises when an inheritance hierarchy is excessively deep; multipath hierarchy refers to a subtype that inherits from a supertype both directly and indirectly, which leads to unnecessary inheritance paths in the hierarchy; cyclic hierarchy occurs when a supertype in a hierarchy depends on any of its subtypes; rebellious hierarchy arises when a subtype rejects the methods provided by its supertypes; finally, missing hierarchy arises when a design segment uses conditional logic to explicitly manage variation in behavior where a hierarchy could have been created and used to encapsulate that variation.
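To make one of these definitions concrete, the following minimal Python sketch flags the classes involved in a cyclic-dependent modularization, that is, classes that depend on themselves directly or indirectly. It is only an illustration: the function name, the dependency map, and the class names are hypothetical, and it is not the detection algorithm used by any tool discussed in this paper.

```python
from typing import Dict, List, Set


def classes_in_cycles(deps: Dict[str, List[str]]) -> Set[str]:
    """Return the classes that can reach themselves through dependency edges."""

    def reachable(start: str) -> Set[str]:
        # Iterative depth-first traversal over the outgoing dependencies.
        seen: Set[str] = set()
        stack = list(deps.get(start, []))
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(deps.get(node, []))
        return seen

    return {cls for cls in deps if cls in reachable(cls)}


# Hypothetical dependency map: Order -> Invoice -> Customer -> Order is a cycle.
example_deps = {
    "Order": ["Invoice"],
    "Invoice": ["Customer"],
    "Customer": ["Order"],
    "Logger": [],
    "Report": ["Logger", "Order"],
}
cyclic = classes_in_cycles(example_deps)
```

In this example, `Report` depends on a class in the cycle but is not itself part of it, so it is not flagged.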
As previously mentioned, another fundamental aspect of software design concerns refactoring, which starts from the recognition that it is difficult to produce good code and a good design from the beginning and that, as the requirements change, the design must change too.
Refactoring, popularized by Martin Fowler, is generally motivated by the detection of a code smell; for example, a method may appear excessively long and complex, or contain a lot of code duplicated from another method [
4].
Refactoring, therefore, provides techniques for evolving a design in small steps. Its principles are based on changing the internal structure of the software system to make it simpler and easier to understand, without changing its observable behavior.
It is applied to improve some non-functional features of the software such as readability, maintainability, reusability, extensibility of the code as well as the reduction of its complexity, possibly through the subsequent introduction of design patterns.
3. Related Works
The impact of the code-level smells defined by Fowler [
4] has been widely investigated in the research literature. In particular, several studies analyzed their effects on maintainability [
11,
12], program comprehension [
13], change and fault-proneness [
14,
15]. Currently, there are also several tools used to automatically detect [
16,
17] code smells, exploiting different sources of information. Several papers discuss how to fix code smells by applying refactoring operations [
18,
19].
At the architectural and design levels, smells are ultimately instances of poor design decisions [
2]. The presence of design problems contributes to system erosion [
1,
2]. These smells have a negative impact on system life cycle properties, such as understandability, testability, extensibility, and re-usability [
3].
Several research papers in the literature deal with the detection of architectural smells, and some also address the influence of architectural smells on issue-related activities.
The work of Le et al. [
20] presented one of the largest empirical studies on architectural changes in long-lasting software systems, based on the analysis of 14 systems at the version level, thanks to the use of ARCADE, a software workbench that allows detecting different aspects of architectural change.
These results show that the versioning scheme of a system is not an accurate indicator of architectural change and that the architecture of a system may be relatively unstable in view of a release.
Fontana et al. [
21] conducted a large-scale empirical study investigating the correlations between code smells and architectural smells. Specifically, they investigated 102 Java projects, using the SonarQube plug-in “Antipatterns-CodeSmells” for code smells and Arcan for architectural smells, with the aim of understanding whether architectural smells are independent of code smells or whether one derives from the other. Their results show that there is no correlation between the two categories of smell, because the presence of code smells does not directly imply the presence of architectural smells and vice versa.
In [
22], a system-level multiple refactoring algorithm is described that can automatically identify move method, move field, and extract class refactoring opportunities according to the principle of high cohesion and low coupling. The algorithm works by merging and splitting related classes to obtain the optimal functionality distribution at the system level.
Brunet et al. [
23] studied the evolution of architectural violations in 76 versions selected from four subject systems, showing that the number of architectural violations grows constantly over time, that some previously identified violations reappear, and that in all the systems studied a critical core can be identified that does not change over time.
Arcan [
24] is a static analysis tool targeted at the detection of three architectural smells, including cycles and hubs. Arcan creates a graph database containing the structural dependencies of a Java system and then runs several detection algorithms (one per smell) on this graph. Finally, there are some commercial tools for detecting architectural smells, such as Designite [
25], which identifies seven architecture smells, including cycles and other dependency-based smells. As far as we are aware, all the previous tools have no predictive capabilities.
Pietrzak and Walter [
26] analyzed different relations that exist among smells and provided tips on how they could be exploited to improve the detection of other smells.
Numerous research papers empirically evaluated the effects related to the presence of smells. Le et al. [
17] presented an empirical study of architectural decay and its impact on software systems. For each version of each system, different architectural recovery techniques were applied, considering different types of smells. They examined the relationships between the smells collected and about 42,000 issues extracted from the repositories of the systems in question. This showed how architectural decay can cause significant problems for a software system. Mo et al. [
27] presented an empirical study of hotspot patterns that cause high maintenance costs. The study shows that these patterns identify not only the most error-prone and change-prone files, but also the root causes of bug-proneness and change-proneness in specific architecture problems. Tufano et al. [
12] presented an empirical investigation into when and why code smells are introduced in software projects. The study, conducted over the commit history of open source projects, demonstrated that most of the time smells affect code artifacts since their creation.
In [
15], the authors analyzed the magnitude and effects of smell co-occurrence, i.e., the co-occurrence of different types of smells on the same code component. They observed that some code smells frequently co-occur and that method-level code smells may be the root cause for the introduction of class-level smells.
In [
28], the frequent collocations of 14 code smells detected in numerous Java open source systems are empirically validated. The authors highlighted that smell collocations lead to recurring patterns that could help to prioritize the classes to be refactored.
The presence of code smells has been investigated also in industrial projects [
29]. The study reports an empirical evaluation of inter-smell relations by analyzing larger systems, and by including both industrial and open-source ones. Palomba et al. [
15] reported a large-scale empirical investigation on the diffuseness of code smells and their impact on code change- and fault-proneness, showing that smells characterized by long and/or complex code are highly diffused, and that smelly classes have a higher change- and fault-proneness than smell-free classes.
Table 1 summarizes the main differences that emerged from the literature analysis.
All of these studies were conducted at a significantly coarser granularity than our work, comparing the different releases of the analyzed software systems. The main contributions with respect to the existing literature are:
A fine-grained analysis at the commit level, that is, smell detection and refactoring detection were performed for each commit;
A focus on design smells, whereas most existing papers deal with the analysis of code smells;
An analysis of design smell removal and addition for each commit;
An analysis of the relationships with refactoring activities and the presence of issues.
4. Empirical Study Design
Three research questions, concerning different aspects of the evolution of design smells in a software system, are addressed in this study. As previously stated, the study focuses on the presence of different types of smells in software design and the related evolution of the software components. Thus, the goal of the study is to analyze the evolution of design smells in open source software projects, to understand how they are related to maintenance activities, and specifically to refactoring and the presence of issues. To this end, the study analyzes the whole history of each software system considered, commit by commit. The research questions are the following:
RQ1: To what extent are design smells subject to change?
RQ2: To what extent are design smells removed during the system’s evolution?
RQ3: To what extent can smell removal be related to refactorings or issues?
4.2. Data Extraction and Analysis Methodology
The proposed study reports results for eight Java open-source projects. These systems were selected because they satisfy the following criteria: the programming language is Java; the Git repository is active and has more than one release; they vary in application domain, size, and number of revisions; and they have also been used in other studies.
Specifically, Log4j is a Java library used for managing logs in the Java environment; JavaAssist is a Java library used to modify Java bytecode, allowing the implementation of a class to be changed at run time; Guice is an open-source software framework that provides dependency injection support using annotations to configure Java objects; JUnit4 is a framework used for writing repeatable tests; Atlas is a framework consisting of an extensible set of core governance services that allow companies to operate effectively and to integrate with the entire enterprise data ecosystem; Commons Digester allows the configuration of XML modules; Zookeeper provides an open-source distributed configuration and synchronization service for large distributed systems; finally, Commons Net is a library that implements the client side of many basic Internet protocols, providing fundamental protocol access.
Table 2 shows the characteristics of the selected projects, namely the total number of commits analyzed for each project and the dates of the first and last commits.
Figure 1 depicts the toolchain used to gather the data required for the analysis. The first step was the extraction of the change history of the software systems, considering each commit. Specifically, all the commit log messages were analyzed and parsed to extract all the changes performed on the files of the code repository. All the data obtained were organized and stored in a data set.
At the same time, the source code at each commit was downloaded and analyzed to understand its evolution over time, commit by commit.
This means that, at each commit of the system, the source code has been analyzed as follows:
Detection of the design smells: to this aim, the Designite tool was used. Designite is a software design quality assessment tool that supports comprehensive design smell detection and provides a detailed metrics analysis. It also offers various features to help identify issues contributing to design debt and improve the design quality of the analyzed software system.
Table 3 reports the design smells detectable by Designite [
19];
Detection of refactoring actions: to this aim, version 1.0 of the Refactoring Miner tool was used. Refactoring Miner [
30] is an open-source tool that classifies the different refactorings in the history of Java projects. Refactoring Miner takes as an input the list of commits and returns a list of refactoring operations applied between consecutive commits. Refactoring Miner can detect 15 types of refactorings from different types of code elements (see
Table 4).
The output of these two steps consists of two datasets. The first contains all the information obtained from the Designite tool: the name of the project analyzed, the commit id, the author, the date of the commit, the file, and the type of design smell identified.
The second contains all the information about refactoring obtained with the Refactoring Miner tool: the name of the project, the commit id, the analyzed file, the presence or absence of refactoring activities, and, where present, the type of refactoring found.
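Since both datasets share the project name, the commit id, and the file, they can be joined on these three fields. The following Python sketch illustrates one possible way to do so; the record layout and the field names (`project`, `commit`, `file`, `smell`, `refactoring`) are assumptions made for illustration and do not reproduce the actual output formats of Designite or Refactoring Miner.

```python
from typing import Dict, Iterable, Optional, Tuple

Key = Tuple[str, str, str]  # (project, commit id, file path)


def merge_datasets(
    smell_rows: Iterable[dict], refactoring_rows: Iterable[dict]
) -> Dict[Key, dict]:
    """Join smell and refactoring records on (project, commit, file)."""
    merged: Dict[Key, dict] = {}

    def entry_for(row: dict) -> dict:
        key = (row["project"], row["commit"], row["file"])
        return merged.setdefault(key, {"smells": set(), "refactorings": []})

    for row in smell_rows:
        entry_for(row)["smells"].add(row["smell"])
    for row in refactoring_rows:
        entry = entry_for(row)
        refactoring: Optional[str] = row.get("refactoring")
        if refactoring:  # None or "" means no refactoring was detected
            entry["refactorings"].append(refactoring)
    return merged


# Hypothetical records, not taken from the studied systems.
smell_rows = [
    {"project": "demo", "commit": "c1", "file": "A.java", "smell": "Unutilized Abstraction"},
    {"project": "demo", "commit": "c1", "file": "A.java", "smell": "Deficient Encapsulation"},
]
refactoring_rows = [
    {"project": "demo", "commit": "c1", "file": "A.java", "refactoring": "Extract Method"},
    {"project": "demo", "commit": "c1", "file": "B.java", "refactoring": None},
]
merged = merge_datasets(smell_rows, refactoring_rows)
```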
As shown in
Figure 1, these datasets were merged into an integrated dataset containing all the data required for the analysis. For each commit, and for each file analyzed in that commit, the resulting dataset contains all the information on smells and refactorings previously collected. To check whether a smell was added or removed in a specific file at a given commit, the algorithm compares the information for that file with the information from the preceding commit. If one or more smells were present in the previous commit and are no longer present in the current one, the file is labelled as removed; if, on the contrary, smells detected in the current commit were not present in the previous one, it is labelled as added; finally, if the information relating to a file does not change between two successive commits, it is labelled as no change.
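The labelling rule described above can be sketched in Python as follows. This is a minimal illustration under one assumption: the label is assigned per smell type by comparing the sets of smells detected on the same file in two consecutive commits. The function name and the example data are hypothetical.

```python
from typing import Dict, Set


def label_smell_changes(previous: Set[str], current: Set[str]) -> Dict[str, str]:
    """Label each smell seen in either commit as removed, added, or no change."""
    labels: Dict[str, str] = {}
    for smell in previous | current:
        if smell in previous and smell not in current:
            labels[smell] = "removed"
        elif smell not in previous and smell in current:
            labels[smell] = "added"
        else:
            labels[smell] = "no change"
    return labels


# Hypothetical smells detected on one file in two consecutive commits.
before = {"Unutilized Abstraction", "Deficient Encapsulation"}
after = {"Deficient Encapsulation", "Broken Hierarchy"}
labels = label_smell_changes(before, after)
```

When several smells receive the label removed in the same call, they are co-removed in the sense used later in the analysis.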
Moreover, a more in-depth qualitative analysis was carried out to understand whether the refactoring actions contribute to the removal of the smells.
In detail, considering all the systems and starting from the initial dataset, a significant sample of 80 files was selected in which the removal of a smell co-occurred with the presence of a refactoring. Each file was manually inspected by two authors, who assessed whether a refactoring was explicitly involved where the removal had occurred by analyzing the text of the commit message and inspecting the source code. In only 13% of cases did the authors agree that a refactoring action supported the removal of the smell. The inter-rater agreement, measured with Cohen’s kappa, was 0.7, which indicates strong agreement. Subsequently, the authors resolved the discrepancies by discussing the cases on which they had not agreed.
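For reference, Cohen's kappa compares the observed agreement between two raters with the agreement expected by chance. The following Python sketch computes it from two lists of labels; the ratings in the example are invented for illustration and are not the ratings collected in this study.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(rater_a: Sequence[str], rater_b: Sequence[str]) -> float:
    """Compute Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label for every item
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings: "yes" means the rater judged that a refactoring
# supported the smell removal in the inspected file.
rater_1 = ["yes", "yes", "no", "no", "no", "no", "no", "no", "no", "no"]
rater_2 = ["yes", "no", "no", "no", "no", "no", "no", "no", "no", "no"]
kappa = cohen_kappa(rater_1, rater_2)
```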
5. Results
This section reports the analysis of the results of our experiment with respect to the research questions raised during the experimental design. The main goal of this study is to better understand what happens to the refactoring-related activities of a system when design smells are detected. The first step is to quantify the number of design smells identified in the various classes and how many of these classes are subject to change (RQ1). The next steps are to understand how many of these smells are removed during the evolution of the system (RQ2) and how refactoring actions relate to these changes (RQ3).
5.1. RQ1—To What Extent Are Design Smells Subject to Change?
Figure 2 reports, for each system (Log4j, JavaAssist, Guice, JUnit4, Atlas, Commons-Digester, Zookeeper, and Commons-Net), the percentage of modified classes that have at least one smell and the percentage of clean classes. The figure shows that most of the changed files have at least one smell, suggesting that developers predominantly change classes that have at least one smell.
Figure 3 depicts a bar chart representing the number of changes on classes where design smells are detected, for each software system analyzed. In particular, it represents the number of classes with 0, 1, 2, 3, or 4 smells detected. The distribution is similar across systems: the number of classes decreases as the number of smells increases.
These data confirm the idea that few classes are responsible for most of the design violations. To investigate this aspect in more detail,
Figure 3 reports the changes performed on classes that are affected by more than one smell.
Figure 4 reports, for each system, the percentage of classes affected by changes, with respect to all classes, distributed among the different types of smells. Specifically, it can be observed that in all the systems, the classes most frequently changed are the ones affected by the following smells: unutilized abstraction, deficient encapsulation, cyclic-dependent modularization, insufficient modularization, and broken hierarchy. This result reveals an improper use of inheritance in the analyzed software and provides evidence that, when it occurs, major negative effects can be observed during maintenance activities.
Specifically,
Figure 3 depicts, for the smelly classes that are changed most frequently, the number of co-occurring smells on those classes, distinguishing the different types.
The analysis of this figure shows that some smell pairs co-occur more frequently in the classes subject to changes. It also confirms that the classes mainly subject to changes are affected by the following types of smell: unutilized abstraction, deficient encapsulation, cyclic-dependent modularization, insufficient modularization, and broken hierarchy. This reveals that when modularization problems occur, significant difficulties emerge in the architecture of the software system; indeed, cyclic-dependent modularization smells are also detected. This can increase the severity of maintenance tasks. The data were statistically analyzed using Spearman's rho, a non-parametric rank-based correlation coefficient.
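As a reminder of how Spearman's rho works, the coefficient is the Pearson correlation computed on the ranks of the two variables, with tied values receiving the average of their ranks. The Python sketch below is a plain illustration of that definition; the choice of variables (number of smells per class versus number of changes) and the values are assumptions made for the example, not data from this study, and the sketch does not compute p-values.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of the rank-transformed samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("samples must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: smells detected per class and changes to that class.
smell_counts = [0, 1, 2, 3, 4]
change_counts = [3, 5, 9, 20, 41]
rho = spearman_rho(smell_counts, change_counts)
```

Because the coefficient depends only on ranks, any monotonically increasing relationship, such as the one in the example, yields the maximum value.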
In detail, in Log4j there is a high number of changed classes, 666, where both the unutilized abstraction and deficient encapsulation smells are detected (p-value = 0.006). Similarly, the changed classes where the pairs (cyclic-dependent modularization, insufficient modularization) and (unutilized abstraction, broken hierarchy) are detected number 446 and 385, respectively (p-value = 0.000). In the case of JavaAssist, the number of changed classes is higher; in particular, 1028 changes were performed on classes affected by both the cyclic-dependent modularization and insufficient modularization smells (p-value = 0.000).
Nevertheless, there is a relevant number of changes even in classes where the pairs (deficient encapsulation, cyclic-dependent modularization; p-value = 0.000) and (deficient encapsulation, insufficient modularization; p-value = 0.000) are detected, 843 and 738, respectively. For the Guice system, the most frequently changed classes are the ones with the pair (cyclic-dependent modularization, insufficient modularization; p-value = 0.000), with 707 changes. This result appears particularly relevant because the next highest value is 205 and refers to the pair (cyclic-dependent modularization, deficient encapsulation; p-value = 0.000).
In the case of JUnit4, the most frequently changed classes are the ones affected by the unutilized abstraction smell. In particular, classes affected by unutilized abstraction together with deficient encapsulation (p-value = 0.000), insufficient modularization (p-value = 0.000), or broken hierarchy (p-value = 0.043) exhibit a higher number of changes. A very similar case occurs for the Atlas system.
Figure 5 reports that a relevant number of classes subject to frequent changes are affected by multiple smells, namely unutilized abstraction, deficient encapsulation, cyclic-dependent modularization, insufficient modularization, and broken hierarchy (
p-values = 0.000).
In detail, 987 classes affected by deficient encapsulation and insufficient modularization are subject to change; similarly, 726 classes are affected by deficient encapsulation and cyclic-dependent modularization, and 694 by cyclic-dependent modularization and insufficient modularization.
Results for Commons-Digester show that classes with the unutilized abstraction smell are frequently committed, and more often so when the deficient encapsulation and broken hierarchy smells are also detected, involving 221 and 404 classes, respectively. Moreover, 226 changes were performed on classes where both the insufficient modularization and cyclic-dependent modularization smells were detected. In the case of Zookeeper, a relevant number of changes emerges for classes where the following smells are detected: deficient encapsulation, cyclic-dependent modularization, and insufficient modularization, with 1077, 1024, and 1126 changes, respectively.
Finally, the results obtained for Commons-Net point out that a relevant number of changes are performed on classes where the following smells are detected: unutilized abstraction, deficient encapsulation, insufficient modularization, and broken hierarchy, with 830, 679, 350, and 348 changes, respectively.
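Pair counts of the kind reported above can be tallied by enumerating, for every change, all unordered pairs of smells detected on the changed class. The Python sketch below shows one way to do this; the function name and the change records are hypothetical and serve only as an illustration of the counting step.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Set


def count_smell_pairs(changed_classes: Iterable[Set[str]]) -> Counter:
    """Count how many changes hit a class affected by each pair of smells."""
    pair_counts: Counter = Counter()
    for smells in changed_classes:
        # Sorting makes each pair canonical, so (a, b) and (b, a) coincide.
        for pair in combinations(sorted(smells), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical change records: the smells detected on each changed class.
changes = [
    {"Deficient Encapsulation", "Unutilized Abstraction"},
    {"Cyclic-Dependent Modularization", "Insufficient Modularization", "Deficient Encapsulation"},
    {"Deficient Encapsulation", "Unutilized Abstraction"},
    {"Broken Hierarchy"},
]
pair_counts = count_smell_pairs(changes)
```

A class with a single smell contributes no pair, while a class with three smells contributes three pairs.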
Summary for RQ1. Overall, it can be observed that in all the systems, the classes most frequently changed are affected by the following smells: unutilized abstraction, deficient encapsulation, cyclic-dependent modularization, insufficient modularization, and broken hierarchy.
These data confirm the idea that few classes are responsible for the large part of design violations.
Specifically, it can be observed that in all the systems, the classes most frequently changed are those where the following smells co-occur: unutilized abstraction, deficient encapsulation, cyclic-dependent modularization, insufficient modularization, and broken hierarchy. This suggests that maintainers need to continuously monitor the presence of multiple design smells in a software system in order to avoid a critical increase in change activities.
5.2. RQ2—To What Extent Are Design Smells Removed during the System’s Evolution?
To investigate in more detail how design smells are removed during the evolution of the system, an analysis was performed to understand when smells are added and/or removed during the history of a software system, and whether an addition or removal involves more than one smell at the same time.
Table 5 and
Table 6 report, for each type of smell, the percentage of smells added and the percentage of smells removed in each class during the evolution of the system, detected commit by commit.
From the analysis of the data, it is possible to observe that numerous design smells are removed and added during the history of the software system, especially in classes with more frequently detected smells.
In particular, the percentage of smells removed is greater than the percentage of smells reintroduced, suggesting that the changes in the various classes lead to improvements.
To investigate the relationship among the different design smells,
Figure 6 reports the groups of smells that are co-removed from a class, i.e., removed in the same commit. The graphs for Commons Digester and Commons Net are not included because these systems have no co-removed smells.
As already emerged in RQ1, the classes more subject to removal are the ones affected by the following smells: unutilized abstraction, deficient encapsulation, cyclic-dependent modularization, insufficient modularization, and broken hierarchy.
In the case of Log4j, unutilized abstraction emerges as the smell most frequently co-removed with other smells. Indeed, it is co-removed with the deficient encapsulation, insufficient modularization, and broken hierarchy design smells.
The results of JavaAssist highlight that there is a limited number of classes with smell removal. Indeed, from
Figure 6 it is possible to observe that the removal values for the different types of smells are often close to zero.
In the case of Guice, there are several smell co-removals, in particular involving the insufficient modularization smell, which is often removed together with the unutilized abstraction, deficient encapsulation, and cyclic-dependent modularization smells.
JUnit4 exhibits peaks of co-removed smells in a few cases, while most removal values are limited and close to zero. Nevertheless, among the co-removed smells, unutilized abstraction emerges together with the deficient encapsulation and broken hierarchy smells.
In the case of Atlas, the most frequently co-removed smell is insufficient modularization. Indeed, it is often removed together with the unutilized abstraction, deficient encapsulation, and cyclic-dependent modularization smells.
Commons Digester’s results highlight that there are very few cases of smell removal; in this case, most of the values are equal to zero.
Figure 6 confirms also for Zookeeper that the smell pairs co-removed are: unutilized abstraction, deficient encapsulation, insufficient modularization, and broken hierarchy. In particular, unutilized abstraction and broken hierarchy are the design smells more frequently co-removed.
In Common Net, there are very few smells removed. Indeed, the corresponding values are often equal to zero, except for the unutilized abstraction design smell.
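The notion of co-removal used in this section (two smells removed from the same class in the same commit) can be illustrated with a minimal sketch. The record layout `(commit_id, class_name, smell)` and the sample records are assumptions made for illustration only; they are not the data or the tooling of the study.

```python
from collections import Counter
from itertools import combinations


def co_removed_pairs(removals):
    """Count smell pairs removed from the same class in the same commit.

    `removals` is an iterable of (commit_id, class_name, smell) tuples;
    this record layout is a hypothetical one chosen for the example.
    """
    smells_by_key = {}
    for commit_id, class_name, smell in removals:
        smells_by_key.setdefault((commit_id, class_name), set()).add(smell)

    pair_counts = Counter()
    for smells in smells_by_key.values():
        # Sort so that each unordered pair is always counted under one key.
        for pair in combinations(sorted(smells), 2):
            pair_counts[pair] += 1
    return pair_counts


# Hypothetical removal records, not data from the study.
example = [
    ("c1", "A", "unutilized abstraction"),
    ("c1", "A", "broken hierarchy"),
    ("c1", "B", "deficient encapsulation"),
    ("c2", "A", "unutilized abstraction"),
    ("c2", "A", "broken hierarchy"),
    ("c2", "A", "insufficient modularization"),
]
counts = co_removed_pairs(example)
```

A class with a single removed smell in a commit (class `B` above) contributes no pair, which is consistent with systems showing removals but no co-removals.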
Summary for RQ2. Overall, it can be observed that in all the systems, the most frequently removed smells are unexploited encapsulation, unutilized abstraction, deficient encapsulation, unnecessary abstraction, and broken hierarchy. All these smells deal with class responsibilities.
As explained in the discussion section, it has been observed that design smells are removed as a result of a restructuring of the architecture of the software project. This means that several activities take place in the same context, such as commenting out code, replacing code, introducing new code, and performing some refactoring.
By contrast, the smells most frequently introduced are multifaceted abstraction and unnecessary abstraction. Both smells generally occur when class responsibilities are not adequately designed, that is, when a class assumes multiple or unneeded responsibilities. Here, the “extract class” refactoring could help developers improve the current design.
Specifically, it can be observed that in all the systems, the most frequently co-removed smells are the following: unutilized abstraction, deficient encapsulation, insufficient modularization, and broken hierarchy. In particular, unutilized abstraction and broken hierarchy are the design smells most frequently co-removed.
5.3. RQ3—To What Extent Can Smell Removal Be Related to Refactorings or Issues?
While the results of RQ2 indicate that removals of different smell types frequently co-occur, they do not yet tell whether refactorings contribute to these removals.
Table 7 reports the percentage of refactorings that involved classes with smell removals. As the table shows, the number of cases in which refactorings actually occur on the source code from which smells were removed ranges from 3 for Common Digester to 123 for Atlas. The latter is also the system with the most smell removals accompanied by refactorings (especially removals of unutilized abstraction).
Overall, it can be observed from the table that most smell removals do not co-occur with a refactoring in the same class. In more detail, in Log4j just 3% of deficient encapsulation smells are removed in co-occurrence with a refactoring.
Similarly, 7% and 10% of unutilized abstraction and insufficient modularization smells, respectively, were removed in co-occurrence with refactorings. In the case of JavaAssist, a relevant number of unutilized abstraction smells are removed in the presence of refactorings. The analysis of Guice shows that 8% and 11% of broken hierarchy and cyclic-dependent modularization smells, respectively, are removed in the same classes where refactorings were detected.
In JUnit4, most of the removed smells belong to the unutilized abstraction, deficient encapsulation, cyclic-dependent modularization, insufficient modularization, and broken hierarchy smell types. In particular, the co-occurrence of refactorings is low for all types of removed smells. In Atlas, the smells mainly removed are unutilized abstraction, deficient encapsulation, and broken hierarchy, with about 10% co-occurrence of refactorings.
In Common Digester, it is possible to observe that the co-occurrence with refactoring is close to zero. In Zookeeper, the smells mainly removed are unutilized abstraction, insufficient modularization, and broken hierarchy, with about 10% co-occurrence of refactorings.
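The per-smell percentages discussed above can be thought of as the share of smell removals whose class was also touched by a refactoring in the same commit. The following is a minimal sketch under that reading; the function name, the record layouts, and the sample values are hypothetical and do not reproduce the study's tooling or the content of Table 7.

```python
from collections import defaultdict


def refactoring_co_occurrence(removals, refactored):
    """Return, per smell type, the percentage of removals whose class was
    also refactored in the same commit.

    removals:   iterable of (commit_id, class_name, smell) tuples
    refactored: set of (commit_id, class_name) pairs with a detected refactoring
    Both layouts are assumptions made for illustration.
    """
    total = defaultdict(int)
    with_refactoring = defaultdict(int)
    for commit_id, class_name, smell in removals:
        total[smell] += 1
        if (commit_id, class_name) in refactored:
            with_refactoring[smell] += 1
    return {s: 100.0 * with_refactoring[s] / total[s] for s in total}


# Hypothetical records, not data from the study.
removals = [
    ("c1", "A", "unutilized abstraction"),
    ("c2", "B", "unutilized abstraction"),
    ("c3", "C", "unutilized abstraction"),
    ("c4", "D", "unutilized abstraction"),
    ("c1", "E", "deficient encapsulation"),
    ("c5", "F", "deficient encapsulation"),
]
refactored = {("c1", "A"), ("c9", "Z")}
rates = refactoring_co_occurrence(removals, refactored)
```

Matching on the pair (commit, class) rather than on the commit alone reflects the class-level granularity used in the text: a refactoring elsewhere in the same commit does not count as co-occurring.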
From a statistical perspective, we checked whether, in a commit that involves a smell removal, refactorings are more likely to occur on the smell-related code than on other source code.
The results of Fisher’s exact test, reported in Table 8, indicate that there is no statistically significant evidence of this for any of the systems analyzed.
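For readers unfamiliar with the test, the check described above can be sketched as a one-sided Fisher’s exact test on a 2x2 contingency table of classes (smell removed or not, refactored or not). The choice of a one-sided alternative, the table layout, and the counts below are assumptions made for illustration; they are not the values of Table 8.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    a: classes with a smell removal and a refactoring
    b: classes with a smell removal and no refactoring
    c: other changed classes with a refactoring
    d: other changed classes with no refactoring

    Returns P(X >= a) under the hypergeometric null hypothesis, i.e. the
    probability of seeing at least `a` refactored classes among the
    smell-removed ones if refactoring were independent of smell removal.
    """
    row1 = a + b       # classes with a smell removal
    col1 = a + c       # classes with a refactoring
    n = a + b + c + d  # all classes considered
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


# Hypothetical counts, not data from the study.
p_value = fisher_exact_greater(3, 1, 1, 3)  # equals 17/70, about 0.243
```

With these illustrative counts the p-value is well above the usual 0.05 threshold, so the null hypothesis of independence would not be rejected, which is the kind of outcome reported for the analyzed systems.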
To gain a further understanding of these data, they have also been related to the presence of issues. As reported in Table 9, the occurrences of smell removal have been measured with respect to co-occurring issue resolutions. There are few relevant cases, and they concern the removal of the unutilized abstraction, deficient encapsulation, cyclic-dependent modularization, and insufficient modularization smells, while in all the other cases the removal of the smell is not related to the handling of an issue. This may suggest that, in these latter cases, the developers specifically focused their activities on smell removal.
Summary for RQ3. Overall, it can be observed that there is no statistically significant co-occurrence of design smell removals and refactorings. This suggests that, in all the systems analyzed, most smell removals were performed without any relationship with refactorings.
5.4. Qualitative Analysis
To understand what types of changes were made when a particular type of smell was removed, and more specifically when this occurred together with a refactoring, a qualitative analysis of the commit messages and of the related source code changes was conducted.
This analysis aims to understand if the removal of smells and refactoring are linked by a cause–effect relationship, or if they are simply two activities that occur in a completely random way in the context of improving the quality of the code.
Analyzing the commit messages of the files in which both the removal of smells and a refactoring had been found, it became evident that there is never an explicit reference to the removal of the smell. On the contrary, references have often been found to the refactoring activities carried out, through the use of words such as “refactoring”, “restructuring”, or “general reorganization”. Following this analysis, it is possible to say that there are only a few cases in which refactoring could have contributed to the removal of the smell.
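The inspection of commit messages was performed manually; as a rough aid, a keyword screening of this kind could be sketched as follows. The regular expressions and the function name are hypothetical choices made for illustration, and such a screening would only pre-select messages for manual reading rather than replace it.

```python
import re

# Word stems taken from the terms quoted in the text; the patterns themselves
# are an assumption, not the procedure followed in the study.
REFACTORING_TERMS = re.compile(r"refactor|restructur|reorgani[sz]", re.IGNORECASE)
SMELL_TERMS = re.compile(r"\bsmells?\b", re.IGNORECASE)


def classify_message(message):
    """Flag whether a commit message mentions refactoring or smells."""
    return {
        "mentions_refactoring": bool(REFACTORING_TERMS.search(message)),
        "mentions_smell": bool(SMELL_TERMS.search(message)),
    }


# Illustrative messages; the second one is quoted from the Atlas example
# discussed in this section.
first = classify_message("General reorganization of the appenders")
second = classify_message("removed un-used modules")
```

The second message shows the limit of keyword matching: it matches neither pattern, although the change does remove an unutilized abstraction, which is why the source code changes were also examined.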
As an example of a refactoring occurring along with the removal of a smell, there is a commit in Log4j (c6e0193f8a5a9d84030d8e8f65fcf4cb18f5d0e) in which the author explicitly states: “/apache/log4j/test/TP.java is not needed”, which confirms the removal of the unutilized abstraction smell.
In addition, in Log4j, there is a commit (da37a1de7a89a586c43fe03aaead20642dda8dcf) that removes a deficient encapsulation smell and whose commit message reads: “Updated Category.java to use ClassLoader.getSystemResource in its static initializer. This is much better than the silly stuff we did before.”
Looking also at the source code, it would seem that the mentioned class is removed and its operations are moved into an already existing class; therefore, the accessibility of these operations changes.
In JavaAssist, the message of one commit (42e1dbed4e870e5c452fa5173ac10921c46d06d0) reads “implemented javassist.bytecode.stackmap package”. Here the removal of an insufficient modularization smell was found. In the source code there is a decomposition in which the methods are redistributed among the classes; it could, therefore, be a refactoring that contributed to the removal of the smell.
In the Commons-Net system, there is a commit (88a631049caa76e8433dd0fcd7f3f97d4c93e383) where the smell unutilized abstraction has been removed and the commit message reads precisely: “removal of recently added files no longer needed because of most recent simplification.”
In many cases, such as the Atlas commit (22624786ee4fe86c94d94fee4bcf4c0855919901), the unutilized abstraction smell is removed: the commit message reports “removed un-used modules” and the class is completely removed, which confirms the removal of that particular smell.
Considering that only a few cases have been found in which refactoring may have contributed to the removal of the smell, it appears that these two activities mostly occur together by chance. Ultimately, these changes can be described as a simple reorganization of the code architecture and system components.