This section presents an empirical evaluation of the proposed approach based on Blobs taken from two real systems.
4.1. Research Questions and Planning
The purpose of this evaluation is twofold: to show that applying the proposed approach to real Blobs extracts classes with better overall quality than the Blobs themselves, and to show that the proposed approach (which considers both cohesion and coupling) can suggest refactoring solutions with better overall quality than those suggested by considering cohesion alone. For this purpose, the following two research questions are defined:
RQ1. When applied to real Blobs, does the proposed approach extract classes with better overall quality than the Blobs?
RQ2. Can the refactoring solutions suggested by considering both cohesion and coupling have better overall quality than the refactoring solutions suggested by considering only cohesion?
To answer RQ1, the proposed approach is first applied to each considered Blob. Then the combined value of the cohesion and coupling for the set of classes extracted by the proposed approach (see Equation (4)) is compared with the combined value of the Blob. For a Blob, the combined value equals the cohesion of the Blob, because according to Equation (4) the combined value of the cohesion and coupling for a set containing only one class (the Blob in our case) equals the cohesion of that class. It is expected that the proposed approach will extract classes with higher combined values than those of the input Blobs.
To answer RQ2, the combined value of the extracted classes suggested by the proposed approach is compared with the combined value of the extracted classes suggested by a variation of the proposed approach that considers only the cohesion when choosing the set of classes to be extracted from a Blob and when merging the small classes. Specifically, the variation results from altering Algorithms 1 and 3 so that the average cohesion (see Equation (2)) of the extracted classes is used instead of their combined value (see Algorithms 4 and 5). Algorithm 2 is not altered in the variation approach because the combined value is not used in that algorithm.
Algorithm 4: variationOfExtractClassRefactoring(A) // a variation of Algorithm 1. It considers the average cohesion instead of the combined value.
Algorithm 5: variationOfMergeSmallExtractedClasses(S) // a variation of Algorithm 3. It considers the average cohesion instead of the combined value.
4.4. Results and Discussion
The proposed approach and its variation were applied to the considered Blobs. Constructors, setters, and getters were excluded from the Blobs before the application of the proposed approach. Constructors are special methods used to instantiate objects from classes; they were removed because they usually use all the attributes of the class, which means they would have a dependency with every method that accesses an attribute of the class. Similarly, setters and getters are special methods used to change and retrieve the values of the attributes. They were removed because each setter or getter usually accesses only one attribute, so they would have no dependency with the setters and getters of other attributes. In addition, methods that do not access any attribute of the class were removed because they have no dependency with the other methods of the class. These exclusions were made because the proposed approach conducts the refactoring based on cohesion and coupling, which are calculated from one kind of dependency between the methods of the class (i.e., the structural dependency resulting from accessing common attributes, see Equation (1)). The setters and getters constitute most of the removed methods.
Both direct and indirect accesses to attributes were considered when identifying the set of attributes accessed by each method of the class. A method directly accesses an attribute if the attribute appears in the body of the method (i.e., the piece of code that implements the method). A method indirectly accesses an attribute if the attribute appears in the body of another method that is directly or indirectly called by the method.
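The preprocessing described above can be illustrated with a minimal Python sketch. It is not the tool used in the study: the function names, the input dictionaries, and the example method and attribute names are all hypothetical, and the sketch assumes that the directly accessed attributes and the call relations of each method have already been extracted from the source code.

```python
from typing import Dict, Set


def accessed_attributes(direct: Dict[str, Set[str]],
                        calls: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each method to the attributes it accesses directly or indirectly.

    `direct` maps a method to the attributes appearing in its own body.
    `calls` maps a method to the methods of the same class that it calls.
    Both inputs are assumed to be extracted beforehand (hypothetical format).
    """
    result: Dict[str, Set[str]] = {}
    for method in direct:
        seen: Set[str] = set()
        attrs: Set[str] = set()
        stack = [method]
        # Depth-first walk over the call relation; `seen` guards against cycles.
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            attrs |= direct.get(current, set())
            stack.extend(calls.get(current, set()))
        result[method] = attrs
    return result


def considered_methods(access: Dict[str, Set[str]],
                       excluded: Set[str]) -> Dict[str, Set[str]]:
    """Drop excluded methods (constructors, setters, getters) and methods
    that access no attribute at all."""
    return {m: a for m, a in access.items() if m not in excluded and a}


# Hypothetical example class with four methods.
direct = {"load": {"buffer"}, "parse": set(), "helper": set(), "getSize": {"size"}}
calls = {"parse": {"load"}}

access = accessed_attributes(direct, calls)
kept = considered_methods(access, excluded={"getSize"})
```

In this example, `parse` indirectly accesses `buffer` through its call to `load`, so it is kept, whereas `helper` accesses no attribute and `getSize` is a getter, so both are removed.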
Table 4 and Table 5 show the number of considered methods (NCM) and the combined value of each Blob before refactoring, where the combined value for a single class equals the cohesion of the class. In addition, the two tables show the number of extracted classes (NEC), the number of methods (NM) in each extracted class, the combined value, and the average cohesion (Ave. Cohesion) of the extracted classes after the application of the proposed ECR approach. For instance, the results in Table 4 show that the NCM of the Blob DeferredDocumentImpl is 34 and that the combined value of the Blob before refactoring is 0.44. The proposed approach suggests three classes to be extracted from the Blob DeferredDocumentImpl, with 27, 4, and 3 methods (NM) in the first, second, and third extracted classes, respectively. The number of extracted classes suggested by the proposed approach varies from one Blob to another. For some Blobs (e.g., XIncludeHandler), the number of suggested extracted classes is high compared to the other Blobs. This is because these Blobs have a high NCM and low cohesion, as most of their considered methods have little or no dependency on each other.
Answering RQ1: The results reported in Table 4 and Table 5 show that the combined value (which reflects the overall quality in terms of cohesion and coupling) of the extracted classes is higher than the combined value of the Blobs. In addition, the average cohesion of the extracted classes is higher than the cohesion of the original Blobs. Therefore, it can be stated that the proposed approach extracts classes from real Blobs with better overall quality than the Blobs themselves.
Answering RQ2: Table 6 and Table 7 show the refactoring solutions resulting from the application of the variation approach (which considers only the cohesion) to the considered Blobs. As can be seen, the refactoring solutions in Table 6 and Table 7 for some Blobs differ from the refactoring solutions suggested by the proposed approach (given in Table 4 and Table 5). For instance, the proposed approach suggests partitioning the Blob DeferredDocumentImpl into three extracted classes, whereas the variation approach suggests partitioning it into four. Although the average cohesion of the extracted classes suggested by the variation approach for DeferredDocumentImpl is higher than that of the extracted classes suggested by the proposed approach, the combined value (which indicates the overall quality in terms of cohesion and coupling) of the extracted classes suggested by the variation approach is lower than the combined value of the extracted classes suggested by the proposed approach. Thus, it can be claimed that the extracted classes suggested by the proposed approach for the Blob DeferredDocumentImpl have better overall quality than the classes suggested by the variation approach. In other cases (e.g., XIncludeHandler and GanttProject), the extracted classes suggested by the proposed approach have higher average cohesion than the extracted classes suggested by the variation approach. Most importantly, for every Blob where the refactoring solution of the proposed approach differs from that of the variation approach, the combined value of the extracted classes suggested by the proposed approach is higher than the combined value of the extracted classes suggested by the variation approach. Based on these observations, it can be stated that considering both cohesion and coupling during ECR can extract classes with better overall quality than considering only cohesion.
Comparing the results with literature: Table 8 and Table 9 compare the results of the proposed approach with the results published in [12] based on the NEC and the average value of LCOM2 of the extracted classes for the considered Blobs from Xerces2 and GanttProject, respectively. The results of this study are compared with those in [12] because the Blobs used in this study were also used in the empirical evaluation in [12]. In addition, the study in [12] is well known in the literature and is highly cited. LCOM2 is an inverse cohesion metric that measures the lack of cohesion of a given class. It is calculated by subtracting the number of method pairs in the class that share at least one attribute from the number of method pairs that do not share any attribute. If the result of the subtraction is negative, the value of LCOM2 is set to 0. Thus, for a class with $n$ methods, the value of LCOM2 ranges from 0 to $n(n-1)/2$, where a smaller value of LCOM2 indicates better quality in terms of cohesion and vice versa. LCOM2 was selected as a criterion of comparison because it was reported for the extracted classes suggested by the approach in [12]. In addition, the metric has been used as a quality indicator in many empirical studies in the literature.
The Python tool (see Section 4.3) that implements the proposed ECR approach was extended to calculate the average LCOM2 of the extracted classes suggested by the proposed approach. The average LCOM2 of the extracted classes suggested by the approach in [12] was calculated manually based on the results reported in [12]. For example, the average LCOM2 of the extracted classes suggested by the approach in [12] for the Blob DeferredDocumentImpl is 20.5 because the approach extracted two classes from the Blob with LCOM2 values of 0 and 41, as reported in [12].
The results given in Table 8 and Table 9 show that the proposed approach suggests extracting a higher number of classes with a smaller average LCOM2 than the approach in [12]. There are two main reasons for this. First, the proposed approach considers only one type of dependency between the methods, namely the structural dependency (see Equation (1)) that exists between two methods when they share common attributes. The same type of dependency is used in LCOM2, as the metric calculates the cohesion of a class based on the dependency resulting from sharing at least one common attribute between the methods of the class. In contrast, the approach in [12] considers both structural and semantic dependency between the methods of the class. Thus, the approach in [12] may suggest extracting a class whose methods are semantically dependent on each other but do not share any attribute. Such a class would have a high value of LCOM2 and would be considered poorly cohesive when evaluated using LCOM2 as a quality indicator. The second reason is that the proposed approach excludes the setters and getters and the methods that do not access any attribute of the class, which were not excluded in the approach in [12]. LCOM2 is badly affected by these methods because they usually do not share attributes with most of the methods in the class, which leads to higher values of the metric.
A further note concerns the semantic dependency: it is challenging to calculate the semantic dependency between the methods of a class automatically. In the approach in [12], it was calculated using Latent Semantic Indexing based on the text similarity of the methods. A major drawback of this technique is that the semantic similarity or dependency between the methods is greatly affected by the volume of text (e.g., comments) in the methods and by the naming convention, which varies from one programmer to another. This drawback may lead to poor refactoring solutions when the semantic dependency is considered in ECR.
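The LCOM2 comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the extension of the tool mentioned in Section 4.3: the function names and the example classes are hypothetical, and each class is assumed to be given as a mapping from method names to the sets of attributes they access. LCOM2 is computed here as the number of method pairs sharing no attribute minus the number of pairs sharing at least one attribute, with negative results set to 0.

```python
from itertools import combinations
from typing import Dict, List, Set


def lcom2(method_attrs: Dict[str, Set[str]]) -> int:
    """LCOM2 of one class: max(P - Q, 0), where P counts method pairs that
    share no attribute and Q counts pairs that share at least one."""
    p = q = 0
    for attrs_a, attrs_b in combinations(method_attrs.values(), 2):
        if attrs_a & attrs_b:
            q += 1
        else:
            p += 1
    return max(p - q, 0)


def average_lcom2(classes: List[Dict[str, Set[str]]]) -> float:
    """Average LCOM2 over a set of extracted classes."""
    if not classes:
        raise ValueError("at least one class is required")
    return sum(lcom2(c) for c in classes) / len(classes)


# Hypothetical extracted classes (method name -> accessed attributes).
class_a = {"m1": {"x"}, "m2": {"x", "y"}, "m3": {"z"}}  # P = 2, Q = 1
class_b = {"m1": {"x"}, "m2": {"x"}}                    # P = 0, Q = 1

average = average_lcom2([class_a, class_b])
```

The same averaging reproduces the manual calculation in the text: two extracted classes with LCOM2 values of 0 and 41 give (0 + 41) / 2 = 20.5.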
4.5. Threats to Validity
Several issues may pose threats to the validity of this empirical evaluation and limit the generality of its reported results. The first issue is that the sets of attributes accessed by methods were extracted automatically from the source code of the considered Blobs using a Java tool developed based on JavaParser [47]. These sets were not manually validated for all the methods of each Blob. However, two methods from each Blob were randomly selected, and the sets of their accessed attributes extracted by the Java tool were manually validated and found to be correct. In addition, the Java tool used in this study is an extension of a tool that was developed in [41] and extensively tested to automatically calculate a set of cohesion metrics from the source code of a set of Java projects consisting of a large number of classes.
The second issue that may threaten the validity of this empirical evaluation is the removal of the methods that do not access any of the attributes of the Blobs. Although it might seem to be bad design, a class can have methods that implement some features of the class without accessing its local attributes, such as methods that implement user interface functionality [48]. Nevertheless, including them in this study could have affected the results of the proposed approach because the cohesion and coupling were measured in the proposed approach based on the dependency resulting from the methods of the class sharing common attributes. Each of these methods would end up alone in a separate extracted class after the execution of Algorithm 1. They would then be merged, during the execution of Algorithm 3, with other extracted classes with which they have no dependency in order to avoid the extraction of small classes, and this would decrease the average cohesion and the overall quality of the extracted classes. This issue can be mitigated by considering the conceptual (or semantic) dependency between the methods of the class (which is out of the scope of this study) in addition to the dependency considered in the proposed approach.
The last issue is that the refactoring solutions suggested by the proposed approach and its variation were not evaluated by software practitioners. Although the classes extracted from the considered Blobs were evaluated based on cohesion and coupling and were shown to have better overall quality than the original Blobs, these refactoring solutions might not be useful to some software practitioners. However, the literature has established the importance of cohesion and coupling and their impact on other quality characteristics that matter in the software industry, such as maintainability and testability. In addition, many of the refactoring approaches previously introduced in the literature have been evaluated using cohesion and coupling metrics.