2.2. Calculating Metrics for Solidity Programs
A contract in the sense of Solidity is a collection of code (its functions) and data (its state) that resides at a specific address on the Ethereum blockchain. Contracts also support constructors, special functions that are run during the creation of the contract and cannot be called afterward. Given the obvious similarities in structure, it is easy to map contracts to classes, states to attributes, and functions to member operations in the OO world and interpret classic OO metrics for Solidity-based smart contracts.
To calculate these metrics, we implemented a prototype tool called SolMet in Java. For parsing the Solidity source code, we used a generated parser based on a slightly modified version of an existing antlr4 grammar [
23]. We made only a few adjustments (i.e., added proper handling of the underscore operator and language version identifiers) in the antlr grammar to be able to parse older and newer versions of the Solidity language as well. The calculation of metrics was performed either on the Solidity source code directly or by implementing visitors to collect the necessary information from the parser built abstract syntax-tree (AST). The source code and usage instructions of the SolMet tool is available on GitHub [
22]. The tool is able to calculate the source code metrics listed in
Table 1.
SLOC. The source lines of code metric denotes the number of source code lines of the contract, library or interface. It is calculated based on the starting and ending line number of the contracts, libraries, and interfaces provided by the parser.
LLOC. The number of logical lines of code metric counts the non-empty and non-comment lines of the contract, library or interface (i.e., only those lines are counted that contain actual statements). It is calculated based on the Solidity source code of the given contract, library or interface. We scan the source code line by line and filter out all the empty and comment lines to get the LLOC metric. We consider all lines to be comment only lines if they start with “/*”, “*”, “//” or end with “*/”.
CLOC. The comment lines of code metric is the number of comment lines of the contract, library or interface. It is calculated from the Solidity source file with the heuristic described above at LLOC metric.
NF. The number of functions in the contract, library or interface. It is calculated based on the function nodes in the AST, a simple visitor calculates how many function nodes are encountered during the traversal of the parse tree.
McCC. McCC is the McCabe’s cyclomatic complexity [
14] (the number of branching statements + 1) of the functions. We calculate the McCabe complexity by traversing and counting all the branching statements (
if, for, while, do-while) within a function (i.e., in the sub-tree of the AST under the appropriate function node). To represent the McCC value for contracts, we calculate the average of McCC values for all the functions within a contract.
WMC. The weighted method complexity metric is the weighted complexity of functions in a contract, library or interface. For weights, we use the McCC complexity metric of the functions. Therefore, WMC is the sum of the McCC values of the functions within a contract, library or interface.
NL. The nesting level metric denotes the sum of the deepest nesting level of the control structures within the functions of a contract, library or interface. The nesting levels of the individual functions are calculated by visiting all the statements in the sub-tree rooted by the function in question and counting the number of branching statements on the path from the statement to the function definition traversing the parent nodes. The final NL value of a function is the maximum of such nesting level values calculated for the statements within the function. The lowest possible NL value (i.e., where there are no embeddings within a function) is 0. To represent the NL value for contracts, we calculate the average of NL values for all the functions within a contract.
NLE. The nesting level without else-if metric is a variant of the NL metric described above. The only difference in the calculation is that if statements do not add to the nesting level count if they appear in an else-if structure. To represent the NLE value for contracts, we calculate the average of NLE values for all the functions within a contract.
NUMPAR. The number of parameters metric simply counts how many parameters a function has. The calculation of this metric is based on a visitor visiting the function nodes in the AST. To represent the NUMPAR value for contracts, we calculate the average of NUMPAR values for all the functions within a contract.
NOS. The number of statements metric is again a simple size metric that counts how many statements there are in a contract, library or interface. The calculation of this metric is based on a visitor visiting the statement nodes in the AST. To represent the NOS value for contracts, we calculate the average of NOS values for all the functions within a contract.
DIT. The depth of inheritance metric measures how deep a contract, library or interface is in the inheritance tree. We calculate this metric with a recursive algorithm that assigns a DIT value to each contract, library or interface that is the maximal DIT measure of its parents plus 1. Since Solidity supports multiple inheritance, we always take the deepest path through the inheritance tree. The DIT values of parent nodes are calculated recursively until we reach a node that has no parents, hence gets a DIT value of 0.
NOA. The number of ancestors metric counts all the different direct or transitive ancestors of a contract, library or interface. This measure tells us how many different contracts one inherits information from. The metric is calculated by traversing the parse tree through the inheritance relations and counting all the different contracts, libraries or interfaces encountered on the path.
NOD. The number of descendants metric measures how many different direct or transitive descendants a contract, library or interface has. It is measured with a recursive algorithm similar to that of DIT. The NOD value of each node in the inheritance list is incremented by one if the visited contract, library or interface derives from that node.
CBO. The coupling between object classes metric in the OO paradigm measures the number of classes that the actual class is connected to (by using the class as an attribute type, method parameter or return value, etc.). We abstracted this concept here to reflect how the contracts are connected to each other, namely how many other types of contracts are used by a particular contract (as state variable type, local variable type, function parameter, etc.). The metric is calculated by visiting and counting the user-defined type name nodes in the AST, which are the usage points of the other user-defined types (i.e., contracts, libraries or interfaces) within a contract.
NA. The number of attributes metric is adapted to Solidity to count the number of state variables. This metric counts all the state variable declaration nodes in the AST belonging to a contract, library or interface.
NOI. The number of outgoing invocations metric measures how many different functions are called from a function in a contract or library. This metric is meant to measure coupling through function calls. We consider only the calls to user-defined functions, and not to built-in functions, but we include modifier and event calls as well. The metric is calculated by visiting the call nodes in the AST. To represent the NOI value for contracts, we calculate the average of NOI values for all the functions within a contract.