A Notional Understanding of the Relationship between Code Readability and Software Complexity
Abstract
1. Introduction
- Investigate the relationship between code readability and software complexity, and present an approach for constructing automatic prediction tools using machine learning and various readability features.
- Provide a comprehensive analysis of various classifiers, including decision tree, naïve Bayes, Bayesian network, neural network, and SVM classifiers, evaluated with several measures, giving users the flexibility to select the classifier whose accuracy characteristics best fit their needs.
- Investigate the effect of 25 readability features on classifier performance.
- Apply a variety of complexity metrics to gain deeper insight into software complexity, including Chidamber and Kemerer’s object-oriented metrics (WMC, RFC, DIT, and LCOM), Lorenz and Kidd’s operation-oriented metric (OSavg), the lines-of-code metric (LOC), and Halstead’s metric.
2. Literature Review
2.1. Readability Metrics
2.2. Complexity Metrics
- Weighted methods per class (WMC): the sum of the complexities of all methods in a single class, where each method’s complexity is computed using cyclomatic complexity (a computation sketch follows this list). The complexity of a class therefore increases as the number of methods and their individual complexities increase; hence, the WMC value should be kept as low as possible.
- Depth of inheritance tree (DIT): the maximum length of the path from the root to the lowest class. A higher DIT value indicates that the lower classes inherit many methods and properties, making class behavior harder to predict and increasing design complexity. On the other hand, a deeper tree implies greater code reusability, which has a positive effect on the code. However, there is no standard range of accepted DIT values.
- Number of direct children (NOC): the number of subclasses that directly inherit from the class. As the NOC increases, code reusability increases, but so does the amount of testing required [17].
- Coupling between objects (CBO): the number of other classes to which a class is coupled. A high CBO value makes the class harder to modify and test, so the CBO should be kept as low as possible.
- Response for a class (RFC): the set of methods that can be executed in response to a message received by an object of the class. A high RFC value increases the testing effort because of the longer test sequences [17], thus increasing the design complexity of the class.
- Lack of cohesion in object methods (LCOM): measures the degree to which the methods of one class do not access the same attributes; a high LCOM indicates a more complex class design. Therefore, it is better to keep the LCOM value low.
- Average operation size (OSavg).
- Operation complexity (OC).
- Average number of parameters per operation (NPavg).
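As referenced above, the following is a minimal sketch (our illustration, not the tooling used in the paper) that approximates each method’s cyclomatic complexity by counting decision-point keywords in Java-like source text and sums them per class to obtain WMC:

```python
# Minimal sketch: WMC via a keyword-count approximation of McCabe's
# cyclomatic complexity (decision points + 1 per method).
import re

# Decision-point tokens; each adds one independent path.
DECISION_RE = re.compile(r"\b(if|for|while|case|catch)\b|&&|\|\|")

def cyclomatic_complexity(method_source: str) -> int:
    """Cyclomatic complexity ~= number of decision points + 1."""
    return len(DECISION_RE.findall(method_source)) + 1

def wmc(method_sources: list[str]) -> int:
    """Weighted Methods per Class: sum of per-method complexities."""
    return sum(cyclomatic_complexity(m) for m in method_sources)

methods = [
    "int abs(int x) { if (x < 0) return -x; return x; }",              # complexity 2
    "int sum(int[] a) { int s = 0; for (int v : a) s += v; return s; }", # complexity 2
]
print(wmc(methods))  # 4
```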
2.3. Software Complexity and Readability
3. Study Methodology
3.1. Source Code Projects
3.2. Software Metrics and Tools Selection
3.2.1. Readability Metrics Selection
- Average line length (number of characters);
- Maximum line length (number of characters);
- Average number of identifiers;
- Maximum number of identifiers;
- Average identifier length;
- Maximum identifier length;
- Average indentation (preceding whitespace);
- Maximum indentation (preceding whitespace);
- Average number of keywords;
- Maximum number of keywords;
- Average number of numbers;
- Maximum number of numbers;
- Average number of comments;
- Average number of periods;
- Average number of commas (,);
- Average number of spaces;
- Average number of parentheses and braces;
- Average number of arithmetic operators (+, −, *, /, %);
- Average number of comparison operators (<, >, ==);
- Average number of assignments (=);
- Average number of branches (if);
- Average number of loops (for, while);
- Average number of blank lines;
- Maximum number of occurrences of any single character;
- Maximum number of occurrences of any single identifier.
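As an illustration of how such features can be computed, the following minimal sketch (our own hypothetical extractor, not the tool used in the study) derives a handful of the per-line features listed above from a source string:

```python
# Minimal sketch: compute a few of the listed readability features.
def readability_features(source: str) -> dict[str, float]:
    lines = source.splitlines() or [""]
    n = len(lines)
    line_lens = [len(l) for l in lines]
    indents = [len(l) - len(l.lstrip(" \t")) for l in lines]
    return {
        "avg_line_length": sum(line_lens) / n,
        "max_line_length": max(line_lens),
        "avg_indentation": sum(indents) / n,
        "max_indentation": max(indents),
        "avg_commas": sum(l.count(",") for l in lines) / n,
        "avg_blank_lines": sum(1 for l in lines if not l.strip()) / n,
        # Rough heuristic: '=' minus '==' pairs; >=, <=, != would need tokenizing.
        "avg_assignments": sum(l.count("=") - 2 * l.count("==") for l in lines) / n,
    }

snippet = "int a = 1, b = 2;\n\nif (a == b) {\n    a = b;\n}\n"
print(readability_features(snippet))
```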
3.2.2. Complexity Metrics Selection
- Chidamber and Kemerer’s metrics suite: This is one of the most referenced metrics suites for object-oriented metrics [25]. It was proposed by Chidamber and Kemerer in 1994 and contains six metrics: WMC, DIT, NOC, CBO, RFC, and LCOM. From these six metrics, we selected the four most related to complexity:
(a) WMC; (b) DIT; (c) LCOM; (d) RFC.
The selection was based on what the metrics measure, since these were the metrics most directly related to software complexity. The WMC metric measures the complexities of all the methods within a class, and the DIT, LCOM, and RFC metrics measure the design complexity of the class.
- Lorenz and Kidd’s metrics: In 1994, Lorenz and Kidd proposed three simple operation-oriented metrics concerned with the complexity and size of operations. We selected the average operation size (OSavg) as one of our complexity metrics. It computes the average complexity of operations using the following formula:
OSavg = WMC* / N
where WMC* is the sum of the complexities of all methods in a given class and N is the number of methods in the class. It is better to keep the OSavg low, and classes with an OSavg value greater than 10 should be redesigned.
- Lines of code (LOC): The size of the code is correlated with software complexity. The LOC metric is one of the simplest measures used to gauge programmer productivity and software complexity [26]. It counts all lines of the software’s source code, including whitespace and comments.
- Halstead’s complexity measure: This is a very common code complexity measure proposed by Halstead in 1977. It derives many metrics from primitive counts of operators and operands. We selected the program volume for our investigation since it captures the program’s physical size, and code complexity increases as program volume increases.
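For concreteness, Halstead defines program volume as V = N × log2(n), where N is the program length (total operator and operand occurrences) and n is the vocabulary (distinct operators and operands). A minimal sketch with illustrative token lists:

```python
# Minimal sketch: Halstead's program volume V = N * log2(n).
import math

def halstead_volume(operators: list[str], operands: list[str]) -> float:
    N = len(operators) + len(operands)            # program length
    n = len(set(operators)) + len(set(operands))  # vocabulary
    return N * math.log2(n)

# Tokens for `a = b + b;` : operators {=, +, ;}, operands {a, b, b}
print(halstead_volume(["=", "+", ";"], ["a", "b", "b"]))  # 6 * log2(5) ~ 13.93
```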
3.2.3. Classification Algorithms
3.2.4. Statistical Techniques for Combining Metrics
- Clustering analysis: In this technique, we group the metrics using a clustering method. Clustering is an unsupervised learning approach that groups similar objects into the same cluster. Unlike supervised learning, the number and names of the clusters are unknown in advance; the clustering algorithm determines the best number of clusters based on some measure. To decide whether two objects belong to the same cluster, it uses a distance measure such as the Euclidean or Manhattan distance. The K-means algorithm works as follows:
- (a) Randomly select K points as the initial centroids of the classes.
- (b) Assign each object to the most similar (closest) centroid.
- (c) For each cluster, calculate the mean value, which becomes the new centroid of the class.
- (d) Reassign each object to the closest centroid.
- (e) Repeat steps (c) and (d) until the assignments no longer change.
- (f) Assign the class labels to the objects.
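The following is a minimal sketch of this grouping step, assuming scikit-learn’s KMeans and an illustrative random metrics matrix rather than the study’s actual data:

```python
# Minimal sketch: K-means grouping of a metrics matrix for several K values.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
X = rng.random((100, 7))  # e.g., 100 classes x 7 metrics (illustrative)

for k in range(2, 6):  # try K = 2..5
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    print(k, np.bincount(labels))  # cluster size distribution per K
```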
- Summation division: In this approach, we combine the metrics into one variable based on summation division. Assuming we have N metrics to combine into one variable, we follow these steps:
- (a) For each metric: compute the mean; if the metric value is greater than the mean, set it to 1; if it is equal to the mean, set it to 0; and if it is less than the mean, set it to −1. This step produces N columns, each representing a metric with only three possible values (1, 0, −1).
- (b) Add a new variable (called Metrics_Sum) that is the sum of the N columns produced by the previous step; the Metrics_Sum value ranges from N to −N.
- (c) Compute the median of the Metrics_Sum column.
- (d) If the Metrics_Sum is greater than the median, assign the class label “High”; if it is less than the median, assign the class label “Low”.
This approach produces two class labels: “High” and “Low”. In step (d), when dividing the data into two intervals, we can use one of two statistical measures depending on the data distribution: the mean is used when the data are normally distributed, and the median when they are not.
- PCA dimension reduction: In this approach, we combine the metrics into one variable using principal component analysis as a dimension-reduction technique. PCA applies a linear transformation of the data to a lower-dimensional space such that the variance of the projected data is maximized. We applied PCA to the metrics, specifying that only one component, the strongest, be produced (see the sketch below).
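A minimal sketch of the PCA labeling just described, assuming scikit-learn and illustrative data (note that the varimax rotation the authors applied in SPSS is omitted here):

```python
# Minimal sketch: reduce the metrics to one principal component,
# then split at the median into "High"/"Low" labels.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.random((100, 7))  # illustrative metrics matrix

comp1 = PCA(n_components=1).fit_transform(X).ravel()  # strongest component
labels = np.where(comp1 > np.median(comp1), "High", "Low")
print(dict(zip(*np.unique(labels, return_counts=True))))
```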
- Quartile analysis: Quartiles divide the data into equal groups using three values: the first quartile (Q1), below which 25% of the data fall; the second quartile (Q2), the median, which splits the data in half; and the third quartile (Q3), below which 75% of the data fall. We chose this method to combine the metrics into one variable (a sketch covering this and the summation-division labeling follows these steps). Assuming N metrics, the steps are as follows:
- (a) For each metric: compute the mean; if the metric value is greater than the mean, set it to 1; if it equals the mean, set it to 0; and if it is less than the mean, set it to −1. This step produces N columns, each representing a complexity metric with only three values (1, 0, −1).
- (b) Add a new variable (called Metrics_Sum) that is the sum of the N columns produced by the previous step; the Metrics_Sum value ranges from N to −N.
- (c) Compute Q1, Q2, and Q3 for the Metrics_Sum column.
- (d) If the Metrics_Sum is greater than Q3, assign the class label “High”; if it is between Q1 and Q3, assign “Medium”; and if it is less than Q1, assign “Low”.
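The sketch below (our illustration, with random data standing in for the real metrics) implements both labelings: the summation-division median split and the quartile three-way split:

```python
# Minimal sketch: summation-division and quartile labelings, steps (a)-(d).
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 7))  # rows = classes, columns = N metrics (illustrative)

signs = np.sign(X - X.mean(axis=0))  # step (a): 1 / 0 / -1 per metric
metrics_sum = signs.sum(axis=1)      # step (b): ranges from N to -N

# Summation division: two classes around the median.
median = np.median(metrics_sum)
two_way = np.where(metrics_sum > median, "High", "Low")

# Quartile analysis: three classes around Q1 and Q3.
q1, q3 = np.percentile(metrics_sum, [25, 75])
three_way = np.select(
    [metrics_sum > q3, metrics_sum < q1], ["High", "Low"], default="Medium"
)
print(dict(zip(*np.unique(two_way, return_counts=True))))
print(dict(zip(*np.unique(three_way, return_counts=True))))
```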
4. Experiments and Results
4.1. Principal Component Analysis
- PC1: max character occurrences, max word, LOC, LCOM, RFC, WMC, and program volume.
- PC2: average parenthesis, average arithmetic, average assignment, average if, average comments, average indent, and max indent.
- PC3: average spaces, average comparisons, average blank lines, average comments, and average identifier length.
- PC4: max indents, max line length, and max numbers.
- PC5: average periods, average indents, and average line length.
- PC6: DIT, OSavg, RFC, and WMC.
- PC7: average keywords, and max keywords.
- PC8: average identifier length, average numbers, and max word length.
- PC2, PC3, PC4, PC5, PC7, and PC8 each grouped readability features that captured the same underlying (orthogonal) dimension.
- PC6 contained some software complexity measures which were correlated with each other.
4.2. Case Study 1: Mapping from Readability to Complexity
4.2.1. Statistical Techniques
- Clustering analysis: In this technique, we grouped the complexity metrics using clustering. We used the K-means algorithm with the Euclidean distance as the distance measure. Table 1 shows the results of applying K-means with different K values from two up to five. After examining the values of the LOC, DIT, LCOM, OSavg, RFC, WMC, and program volume complexity metrics, we were able to assign each cluster a complexity degree. However, the class distribution showed that the data were not well distributed across the classes. From these experiments, we concluded that clustering was not a suitable technique for our dataset, since the classes produced were poorly distributed regardless of the K value selected.
- Summation division: In this approach, we combined the seven complexity metrics into one variable representing the overall complexity, based on summation division. In step (d), when dividing the data into two intervals, the median was used rather than the mean since the data were not normally distributed but positively skewed, with the mean greater than the mode and a skewness of 1.098. This approach produced two class labels: “High”, representing high complexity, and “Low”, representing low complexity. The class distribution was acceptable, with 66% low complexity and 34% high complexity.
- PCA dimension reduction: In this approach, we combined the seven complexity metrics into one variable using PCA dimension reduction, specifying that only one component, the strongest, be produced. The resulting component had an eigenvalue of 4.165, which describes the total variance explained. As shown in Table 2, all the complexity metrics loaded on this component with values greater than 0.50, with WMC highest at 0.963. Varimax rotation was applied to obtain a cleaner structure. The component scores were stored in a variable named Comp1, and its median was calculated. The median was chosen rather than the mean since the data were not normally distributed but positively skewed, with a skewness of 9.130. The data were given two class labels: “High” for Comp1 values above the median and “Low” for values below it. With this approach, the data were split equally between the two classes, 50% “High” and 50% “Low”.
- Quartile analysis: Quartiles were used to divide the data into three groups using the first quartile (Q1), second quartile (Q2), and third quartile (Q3). After applying the quartile approach, the class distribution was 26.66% “High”, 38.37% “Medium”, and 34.98% “Low”; the classes were therefore well distributed.
4.2.2. Statistical Techniques Selection
4.2.3. Results
- Summation division results: We used the complexity variable produced by the summation division approach as the dependent variable and the 25 readability features as independent variables, and applied the five classifiers to the dataset. All the algorithms were run in the WEKA tool. For testing and evaluation, two techniques were used: a percentage split and cross-validation. In the percentage-split test, the data were divided into 66% for training and 34% for testing; for the cross-validation test, 10 folds were used (a sketch of both protocols follows). The results for the two testing techniques are listed in Table 3 and Table 4. The best accuracy for the percentage-split validation was obtained with the decision tree classifier, at 90.07%, with mean absolute error and F-measure values of 0.13 and 0.90, respectively. The best result from the 10-fold cross-validation was also obtained with the decision tree, at 90.15% accuracy, with mean absolute error and F-measure values of 0.12 and 0.90, respectively.
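As referenced above, here is a minimal sketch of the two evaluation protocols, using scikit-learn’s decision tree as a stand-in for the WEKA implementation and illustrative data (X = readability features, y = complexity labels):

```python
# Minimal sketch: 66/34 percentage split and 10-fold cross-validation.
import numpy as np
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.random((500, 25))                  # 25 readability features (illustrative)
y = rng.choice(["High", "Low"], size=500)  # complexity class labels (illustrative)

# Percentage split: 66% training, 34% testing.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.66, random_state=0)
clf = DecisionTreeClassifier(random_state=0).fit(X_tr, y_tr)
print("split accuracy:", clf.score(X_te, y_te))

# 10-fold cross-validation.
scores = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=10)
print("10-fold accuracy:", scores.mean())
```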
- PCA dimension reduction results: In this experiment, we applied the five classifiers to our dataset, using the complexity variable produced by the PCA dimension-reduction approach as the dependent variable and the 25 readability features as independent variables. For testing and evaluation, the percentage split and cross-validation were again used. The results are listed in Table 5 and Table 6. The best accuracy for the percentage-split validation was obtained with the neural network classifier, at 89.54%, with a mean absolute error and F-measure of 0.16 and 0.90, respectively. The best result from the 10-fold cross-validation was also obtained with a neural network, at 89.23% accuracy, with mean absolute error and F-measure values of 0.15 and 0.89, respectively. From this case study, we found that code readability had a strong influence on software complexity, underscoring the essential role that code readability plays in the maintenance process and in overall software quality. The experimental results confirmed hypothesis H0, which claimed that code readability influences software complexity, and refuted hypothesis H1, which claimed that it does not. Moreover, the results indicated that code readability predicted software complexity with 90.15% accuracy using a decision tree classifier, which we consider a strong influence. Figure 2 presents the first three levels of the tree model; due to the tree’s large size, we show only these levels, which contain the most effective readability features in this relation: max character occurrences, average arithmetic, average parenthesis, max indents, and max indent (preceding whitespace). Consequently, readability is not a simple attribute that can be measured directly or reflected in a single attribute. Software developers should pay close attention to it during the development phase, as it affects other software quality attributes and the maintenance process.
4.3. Case Study 2: Mapping from Complexity to Readability
4.3.1. Statistical Techniques
- Summation division: In this approach, we combined the 25 readability features into one variable representing the overall readability, based on summation division. Since some readability features have a positive impact on readability and others a negative impact, some steps of the approach were modified as follows (a sketch follows these steps):
- (a) For each readability metric: compute the mean. If the feature has a positive impact: if the readability value is greater than the mean, set it to 1; if it is equal to the mean, set it to 0; and if it is less than the mean, set it to −1. Otherwise, if the feature has a negative impact: if the readability value is greater than the mean, set it to −1; if it is equal to the mean, set it to 0; and if it is less than the mean, set it to 1.
- (b) At the end of the previous step, 25 columns are produced, where each column represents a readability feature with one of three possible values (1, 0, or −1).
- (c) Add a new variable (called Readability_Sum) that is the sum of the 25 columns produced by the previous step; the Readability_Sum value ranges from 25 to −25.
- (d) Compute the median of the Readability_Sum column.
- (e) If the Readability_Sum is greater than the median, assign the class label “High”; if it is less than the median, assign “Low”.
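A minimal sketch of this sign-aware variant is shown below (illustrative data; the impact vector is randomly generated here, whereas the paper derives the actual positive/negative assignments, e.g., from the PCA loadings discussed next):

```python
# Minimal sketch: sign-aware summation division for readability features.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 25))              # 25 readability features per class
impact = rng.choice([1, -1], size=25)  # +1 positive impact, -1 negative (illustrative)

signs = np.sign(X - X.mean(axis=0)) * impact  # flip sign for negative-impact features
readability_sum = signs.sum(axis=1)           # ranges from 25 to -25
labels = np.where(readability_sum > np.median(readability_sum), "High", "Low")
print(dict(zip(*np.unique(labels, return_counts=True))))
```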
This approach produced two class labels: “High”, representing high readability, and “Low”, representing low readability. The median was used to divide the data into two intervals rather than any other statistical measure since the data were not normally distributed, with a skewness of −0.265. The class distribution for this approach was acceptable, with 46.53% low readability and 53.47% high readability.
- PCA dimension reduction: In this approach, we combined the 25 readability features into one variable representing the overall readability, based on PCA dimension reduction. We applied PCA to the 25 readability features, specifying that only one component, the strongest, be produced. The resulting component had an eigenvalue of 5.578 and contained all the readability features; the highest loading was average indent, at 0.797. Varimax rotation was applied to obtain a cleaner structure. Some features had negative loadings, indicating a negative correlation with the component, whereas positive loadings indicated a positive correlation. The negatively correlated features were average spaces, average arithmetic, average blank lines, and average comments. These features have a positive impact on readability, except for average spaces, which could be excluded since it was almost uncorrelated with the extracted component (a very low value close to zero). The remaining features were considered positive features. The component scores were stored in a variable named Component_1, and its median was calculated. The median was selected rather than another statistical measure since the data were not normally distributed, with a skewness of 1.117. The data were given two class labels: “Low” for Component_1 values above the median and “High” for values below it, since metrics loading on the component with positive coefficients had a negative impact on readability, while those with negative coefficients had a positive impact. With this approach, the data were split equally between the two classes, 50% “High” and 50% “Low”.
4.3.2. Results
- Summation division results: We used the readability variable produced by the summation division approach as the dependent variable and the seven complexity metrics as independent variables, and applied the five classification algorithms to the dataset. All the algorithms were run in the WEKA tool. For testing and evaluation, two techniques were used: a percentage split and cross-validation. In the percentage-split test, the data were divided into 66% for training and 34% for testing; for the cross-validation test, 10 folds were used. The results are listed in Table 7 and Table 8. The best accuracy for the percentage-split validation was obtained with the decision tree classifier, at 85.15%, with mean absolute error and F-measure values of 0.20 and 0.85, respectively. The best result from the 10-fold cross-validation was also obtained with the decision tree, at 85.54% accuracy, with mean absolute error and F-measure values of 0.19 and 0.86, respectively.
- PCA dimension reduction results: In this experiment, we applied the five classifiers to our dataset, using the readability variable produced by the PCA dimension-reduction approach as the dependent variable and the seven complexity metrics as independent variables. For testing and evaluation, the percentage split and cross-validation were used; the results are listed in Table 9 and Table 10.
4.4. Case Study 3: Investigating the Size Independence Claim
- Correlation analysis: We applied correlation analysis to examine the relation between lines of code (LOC) and the 25 readability features using the Spearman and Kendall tau-b correlation coefficients (a sketch follows). The tests confirmed that size was correlated with the readability features: some, such as average arithmetic, average if, max word, and max character occurrences, were strongly correlated, while others, such as average spaces and average commas, were only weakly correlated. We then defined three relations between the LOC metric and the readability features: a weak relation, a strong positive relation, and a strong negative relation. A weak relation refers to readability features whose correlation with the LOC metric was between −0.50 and 0.50. A strong positive relation indicates a correlation between 0.50 and 1.0, and a strong negative relation indicates a correlation between −0.50 and −1.0. Table 11 shows the relation between the LOC and readability metrics: 14 features had a weak relation with the LOC metric, 10 features had a strong positive relation, and only one feature had a strong negative relation. The results indicate that many features are related to the LOC metric, either positively or negatively.
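A minimal sketch of such a correlation test using SciPy, with illustrative data (SciPy’s kendalltau computes the tau-b variant, which handles ties):

```python
# Minimal sketch: Spearman and Kendall tau-b correlation between LOC
# and one readability feature (the paper tests all 25 features).
import numpy as np
from scipy.stats import kendalltau, spearmanr

rng = np.random.default_rng(0)
loc = rng.integers(10, 1000, size=200)             # lines of code (illustrative)
feature = loc * 0.05 + rng.normal(0, 5, size=200)  # a size-related feature (illustrative)

rho, p_rho = spearmanr(loc, feature)
tau, p_tau = kendalltau(loc, feature)
print(f"Spearman rho={rho:.2f} (p={p_rho:.3f}), Kendall tau-b={tau:.2f} (p={p_tau:.3f})")
```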
- Classification algorithms: Several experiments were conducted to investigate the previous hypotheses using two techniques: summation division and PCA dimension reduction. We used the results obtained in Section 4.3.2, where these statistical techniques were used to group the 25 readability features into one variable, because, as mentioned earlier, the dependent variable must be a single variable. We then applied several classification algorithms to the results of the two approaches. All the experiments confirmed hypothesis H0, which states that code size influences the code readability features.
4.4.1. Summation Division Results
4.4.2. PCA Dimension Reduction Results
5. Comparison Studies
6. Threats to the Validity and Assumptions
7. Conclusions
8. Future Work
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Dubey, S.; Rana, A. Assessment of Maintainability Metrics of Object-Oriented Software System. ACM Sigsoft Softw. Eng. Notes 2011, 36, 1–7. [Google Scholar] [CrossRef]
- Aggarwal, K.K.; Singh, Y.; Chhabra, J.K. An Integrated Measure of Software Maintainability; Reliability and Maintainability Symposium: Seattle, WA, USA, 2002; pp. 235–241. [Google Scholar] [CrossRef]
- Raymond, D. Reading Source Code. In Proceedings of the 1991 Conference of the Centre for Advanced Studies on Collaborative Research, Toronto, ON, Canada, 28–30 October 1991; pp. 3–16. [Google Scholar] [CrossRef]
- Deimel, L. The Uses of Program Reading. ACM Sigcse Bull. 1985, 17, 5–14. [Google Scholar] [CrossRef]
- Rugaber, S. The Use of Domain Knowledge in Program Understanding. Ann. Softw. Eng. 2000, 9, 143–192. [Google Scholar] [CrossRef]
- Brooks, F. No Silver Bullet Essence and Accidents of Software Engineering. IEEE Comput. 1987, 20, 10–19. [Google Scholar] [CrossRef]
- Buse, R.; Weimer, W. Learning a Metric for Code Readability. IEEE Trans. Softw. Eng. 2010, 36, 546–558. [Google Scholar] [CrossRef]
- Goswami, P.; Kumar, P.; Nand, K. Evaluation of Complexity for Components in Component Based Software Engineering. Int. J. Res. Eng. Appl. Sci. 2012, 2, 902. [Google Scholar]
- Rudolph, F. A New Readability Yardstick. J. Appl. Psychol. 1948, 32, 221–233. [Google Scholar]
- Butler, S.; Wermelinger, M.; Yu, Y.; Sharp, H. Exploring the Influence of Identifier Names on Code Quality: An Empirical Study. In Proceedings of the 14th European Conference on Software Maintenance and Reengineering (CSMR), Madrid, Spain, 15–18 March 2010; IEEE Computer Society: Washington, DC, USA, 2010; pp. 156–165. [Google Scholar] [CrossRef] [Green Version]
- Tashtoush, Y.; Odat, Z.; Alsmadi, I.; Yatim, Y. Impact of Programming Features on Code Readability. Int. J. Softw. Eng. Its Appl. 2013, 7, 441–458. [Google Scholar] [CrossRef] [Green Version]
- Tashtoush, Y.; Darweesh, D.; Albdarneh, M.; Alsmadi, I.; Alkhatib, K. A Business Classifier to Detect Readability Metrics on Software Games and Their Types. Int. J. Entrep. Innov. 2015, 4, 47–57. [Google Scholar] [CrossRef] [Green Version]
- Karanikiotis, T.; Papamichail, M.D.; Gonidelis, L.; Karatza, D.; Symeonidis, A.L. A Data-driven Methodology towards Interpreting Readability against Software Properties. In Proceedings of the 15th International Conference on Software Technologies, Paris, France, 7–9 January 2020; pp. 61–72. [Google Scholar] [CrossRef]
- Sarkar, B.; Takeyeva, D.; Guchhait, R.; Sarkar, M. Optimized radio-frequency identification system for different warehouse shapes. Knowl.-Based Syst. 2022, 258, 109811. [Google Scholar] [CrossRef]
- Sarkar, B.; Takeyeva, D.; Guchhait, R.; Sarkar, M. Mathematical estimation for maximum flow of goods within a cross-dock to reduce inventory. Math. Biosci. Eng. 2022, 19, 13710–13731. [Google Scholar]
- Chidamber, S.; Kemerer, C. A Metrics Suite for Object Oriented Design. IEEE Trans. Softw. Eng. 1994, 20, 476–493. [Google Scholar] [CrossRef] [Green Version]
- Pressman, R. Software Engineering: A Practitioner’s Approach, 6th ed.; McGraw-Hill Science: Irvine, CA, USA, 2005; ISBN 9780071238403. [Google Scholar]
- Alenezi, M. Internal Quality Evolution of Open-Source Software Systems. Appl. Sci. 2021, 11, 5690. [Google Scholar] [CrossRef]
- McCabe, T. A Complexity Measure. IEEE Trans. Softw. Eng. 1976, 2, 308–320. [Google Scholar] [CrossRef]
- Halstead, M. Elements of Software Science; Elsevier: New York, NY, USA, 1977; ISBN 978-0444002051. [Google Scholar]
- Lorenz, M.; Kidd, J. Object-Oriented Software Metrics, 1st ed.; Prentice Hall: St. Kent, OH, USA, 1994; ISBN 013179292X. [Google Scholar]
- Muriana, B.; Onuh, O. Comparison of software complexity of search algorithm using code based complexity metrics. Int. J. Eng. Appl. Sci. Technol. 2021, 6, 24–29. [Google Scholar] [CrossRef]
- Gillberg, A.; Holst, G. The Impact of Reactive Programming on Code Complexity and Readability: A Case Study. Bachelor’s Thesis, Mid Sweden University, Sundsvall, Sweden, 2020. [Google Scholar]
- International Business Machines Corp. Eclipse Platform Technical Overview. 2006. Available online: https://www.eclipse.org/articles/Whitepaper-Platform-3.1/eclipse-platform-whitepaper.pdf (accessed on 6 January 2022).
- Shaik, A.; Reddy, C.; Manda, B.; Prakashini, C.; Deepthi, K. Metrics for Object Oriented Design Software Systems: A Survey. J. Emerg. Trends Eng. Appl. Sci. (JETEAS) 2010, 2, 190–198. [Google Scholar]
- Najadat, H.; Alsmadi, I.; Shboul, Y. Predicting Software Projects Cost Estimation Based on Mining Historical Data. ISRN Softw. Eng. 2012, 2012, 823437. [Google Scholar] [CrossRef] [Green Version]
- Powersoftware. Krakatau Metrics. 2021. Available online: http://www.powersoftware.com/ (accessed on 5 January 2022).
- Hervé, A.; Williams, L. Principal Component Analysis. Wiley Interdiscip. Rev. Comput. Stat. 2010, 2, 433–459. [Google Scholar] [CrossRef]
- IBM. IBM SPSS Statistics. 2021. Available online: https://www.ibm.com/products/spss-statistics (accessed on 2 January 2022).
- Hall, M.; Frank, E.; Holmes, G.; Pfahringer, B.; Reutemann, P.; Witten, I. The WEKA Data Mining Software: An Update. ACM SIGKDD Explor. Newsl. 2009, 11, 10–18. [Google Scholar] [CrossRef]
- Stemler, S. A Comparison of Consensus, Consistency, and Measurement Approaches to Estimating Interrater Reliability. Pract. Assessment, Res. Eval. 2004, 9, 66–78. [Google Scholar] [CrossRef]
Table 1. Class distribution obtained by K-means clustering for different K values.
K Value | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 |
---|---|---|---|---|---|
K = 2 | 12,105 | 75 | _ | _ | _ |
K = 3 | 11,871 | 296 | 13 | _ | _ |
K = 4 | 11,770 | 390 | 18 | 2 | _ |
K = 5 | 11,328 | 769 | 70 | 11 | 2 |
Table 2. Component loadings of the complexity metrics produced by the PCA dimension reduction.
Complexity Metrics | Component |
---|---|
LOC | 0.81 |
DIT | 0.52 |
LCOM | 0.61 |
OSavg | 0.68 |
RFC | 0.91 |
WMC | 0.96 |
Program volume | 0.80 |
Table 3. Case study 1, summation division: percentage-split results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 90.07 | 0.13 | 0.90 | 0.90 | 0.90 |
Naïve Bayes | 88.26 | 0.12 | 0.88 | 0.88 | 0.88 |
Bayesian network | 86.50 | 0.14 | 0.88 | 0.87 | 0.87 |
Neural network | 89.74 | 0.15 | 0.90 | 0.90 | 0.90 |
SVM | 89.74 | 0.10 | 0.90 | 0.90 | 0.90 |
Table 4. Case study 1, summation division: 10-fold cross-validation results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 90.15 | 0.12 | 0.90 | 0.90 | 0.90 |
Naïve Bayes | 88.58 | 0.12 | 0.89 | 0.89 | 0.89 |
Bayesian network | 87.02 | 0.13 | 0.88 | 0.87 | 0.87 |
Neural network | 90.11 | 0.14 | 0.90 | 0.90 | 0.90 |
SVM | 89.62 | 0.10 | 0.90 | 0.90 | 0.90 |
Table 5. Case study 1, PCA dimension reduction: percentage-split results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 88.46 | 0.13 | 0.89 | 0.89 | 0.89 |
Naïve Bayes | 86.53 | 0.14 | 0.87 | 0.87 | 0.87 |
Bayesian network | 87.83 | 0.12 | 0.88 | 0.88 | 0.88 |
Neural network | 89.54 | 0.16 | 0.90 | 0.90 | 0.90 |
SVM | 88.58 | 0.11 | 0.89 | 0.89 | 0.89 |
Table 6. Case study 1, PCA dimension reduction: 10-fold cross-validation results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 88.51 | 0.13 | 0.89 | 0.89 | 0.89 |
Naïve Bayes | 86.37 | 0.14 | 0.87 | 0.86 | 0.86 |
Bayesian network | 88.26 | 0.12 | 0.88 | 0.88 | 0.88 |
Neural network | 89.23 | 0.15 | 0.89 | 0.89 | 0.89 |
SVM | 88.83 | 0.11 | 0.89 | 0.89 | 0.89 |
Table 7. Case study 2, summation division: percentage-split results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 85.15 | 0.20 | 0.85 | 0.85 | 0.85 |
Naïve Bayes | 74.21 | 0.26 | 0.81 | 0.74 | 0.72 |
Bayesian network | 82.25 | 0.18 | 0.82 | 0.82 | 0.82 |
Neural network | 81.91 | 0.27 | 0.82 | 0.82 | 0.82 |
SVM | 76.82 | 0.23 | 0.80 | 0.77 | 0.76 |
Table 8. Case study 2, summation division: 10-fold cross-validation results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 85.54 | 0.19 | 0.86 | 0.86 | 0.86 |
Naïve Bayes | 73.14 | 0.27 | 0.80 | 0.73 | 0.71 |
Bayesian network | 82.78 | 0.18 | 0.83 | 0.83 | 0.83 |
Neural network | 82.33 | 0.26 | 0.82 | 0.82 | 0.82 |
SVM | 77.81 | 0.22 | 0.81 | 0.78 | 0.77 |
Table 9. Case study 2, PCA dimension reduction: percentage-split results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 89.54 | 0.14 | 0.90 | 0.90 | 0.90 |
Naïve Bayes | 74.23 | 0.26 | 0.80 | 0.74 | 0.73 |
Bayesian network | 85.27 | 0.15 | 0.85 | 0.85 | 0.85 |
Neural network | 83.41 | 0.24 | 0.84 | 0.83 | 0.83 |
SVM | 85.27 | 0.15 | 0.85 | 0.85 | 0.85 |
Table 10. Case study 2, PCA dimension reduction: 10-fold cross-validation results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 90.01 | 0.13 | 0.90 | 0.90 | 0.90 |
Naïve Bayes | 76.03 | 0.24 | 0.82 | 0.76 | 0.75 |
Bayesian network | 85.76 | 0.15 | 0.86 | 0.86 | 0.86 |
Neural network | 83.89 | 0.24 | 0.84 | 0.84 | 0.84 |
SVM | 78.92 | 0.21 | 0.82 | 0.79 | 0.79 |
Table 11. Relation between the LOC metric and the readability features.
Weak Relation | Strong Positive Relation | Strong Negative Relation |
---|---|---|
Average spaces | Average parenthesis | Average arithmetic |
Average commas | Average assignment | - |
Average periods | Average comparisons | - |
Average for/while | Average if | - |
Average blank lines | Average indent | - |
Average comments | Max character occurrences | - |
Average identifier length | Max indents | - |
Average indents | Max indent (preceding white space) | - |
Average keywords | Max line length | - |
Average line length | Max word | - |
Average numbers | - | - |
Max keywords | - | - |
Max numbers | - | - |
Max word length | - | - |
Table 12. Case study 3, summation division: percentage-split results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 79.16 | 0.32 | 0.79 | 0.79 | 0.79 |
Naïve Bayes | 73.36 | 0.30 | 0.80 | 0.73 | 0.71 |
Bayesian network | 79.16 | 0.27 | 0.79 | 0.79 | 0.79 |
Neural network | 77.71 | 0.36 | 0.80 | 0.78 | 0.77 |
SVM | 71.09 | 0.29 | 0.80 | 0.71 | 0.68 |
Table 13. Case study 3, summation division: 10-fold cross-validation results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 79.66 | 0.32 | 0.80 | 0.80 | 0.80 |
Naïve Bayes | 73.59 | 0.30 | 0.80 | 0.74 | 0.72 |
Bayesian network | 79.53 | 0.27 | 0.80 | 0.80 | 0.80 |
Neural network | 79.20 | 0.31 | 0.80 | 0.79 | 0.80 |
SVM | 73.14 | 0.27 | 0.80 | 0.73 | 0.71 |
Table 14. Case study 3, PCA dimension reduction: percentage-split results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 79.98 | 0.32 | 0.80 | 0.80 | 0.80 |
Naïve Bayes | 70.13 | 0.32 | 0.78 | 0.70 | 0.68 |
Bayesian network | 79.67 | 0.27 | 0.80 | 0.80 | 0.80 |
Neural network | 77.25 | 0.36 | 0.79 | 0.77 | 0.77 |
SVM | 66.96 | 0.33 | 0.77 | 0.67 | 0.64 |
Table 15. Case study 3, PCA dimension reduction: 10-fold cross-validation results.
Algorithm | Accuracy (%) | Mean Absolute Error | Precision | Recall | F-Measure |
---|---|---|---|---|---|
Decision tree | 80.07 | 0.32 | 0.80 | 0.80 | 0.80 |
Naïve Bayes | 71.39 | 0.31 | 0.78 | 0.71 | 0.70 |
Bayesian network | 79.85 | 0.27 | 0.81 | 0.80 | 0.80 |
Neural network | 78.83 | 0.32 | 0.80 | 0.79 | 0.79 |
SVM | 70.57 | 0.29 | 0.78 | 0.71 | 0.69 |
Citation: Tashtoush, Y.; Abu-El-Rub, N.; Darwish, O.; Al-Eidi, S.; Darweesh, D.; Karajeh, O. A Notional Understanding of the Relationship between Code Readability and Software Complexity. Information 2023, 14, 81. https://doi.org/10.3390/info14020081