A Novel Model for Vulnerability Analysis through Enhanced Directed Graphs and Quantitative Metrics

The rapid evolution of industrial components, the paradigm of Industry 4.0, and the new connectivity features introduced by 5G technology all increase the likelihood of cybersecurity incidents. Such incidents are caused by the vulnerabilities present in these components. Designing a secure system is critical, but it is also complex, costly, and an extra factor to manage during the lifespan of the component. This paper presents a model to analyze the known vulnerabilities of industrial components over time. The proposed Extended Dependency Graph (EDG) model is based on two main elements: a directed graph representation of the internal structure of the component, and a set of quantitative metrics based on the Common Vulnerability Scoring System (CVSS). The EDG model can be applied throughout the entire lifespan of a device to track vulnerabilities, identify new requirements, root causes, and test cases. It also helps prioritize patching activities. The model was validated by application to the OpenPLC project. The results reveal that most of the vulnerabilities associated with OpenPLC were related to memory buffer operations and were concentrated in the libssl library. The model was able to determine new requirements and generate test cases from the analysis.


I. INTRODUCTION
Industrial components are the driving force of almost every industrial field, such as automotive, manufacturing, telecommunications, energy production, transportation, healthcare, and defense [1]-[6]. These types of components are rapidly evolving [7], [8] and are rapidly increasing in number [9]:
• Open-source hardware and software, and Commercial Off-The-Shelf (COTS) components are being integrated to speed up their development.
• They are increasingly connected, providing more advanced connectivity features, enabling new automation applications, services, and data exchange.
• The complexity of industrial systems is also increasing, thus the complexity of their software and hardware is also increasing.
The reuse of open-source software components is a de facto industry norm, with 90% of the participants using pre-existing code [10]-[13]. Moreover, COTS components are highly available in the market, which makes them suitable for speeding up the development of industrial components [14]. COTS components are used for hardware, software, and communication interfaces. Thus, vulnerabilities within such components can create potential entry points for malicious adversaries aiming to disrupt Cyber-Physical System (CPS) operations [15]. In addition, the use of COTS components makes it easier than ever for industrial components to connect to each other and to the Internet, increasing their attack surface even further [16].
Interconnected systems significantly increase the exposure to many security risks, with critical, environmental, and wellbeing impacts. The fifth generation (5G) of wireless technology for cellular networks will facilitate enhanced Internet connectivity and accommodate the connection of multiple devices through the IoT architecture, which will further widen the window of exposure to any threat [6], [9], [17], [18]. Furthermore, this increased connectivity is part of the new paradigm of Industry 4.0. Thus, technologies such as the Internet of Things (IoT) [18]-[21], cloud computing, Artificial Intelligence (AI) [21], [22], and big data (among others) are being extensively used [23]-[28], which will open the door even further to potential attacks.
Industrial components operate in an evolving ecosystem that becomes more complex over time: the use of COTS components and open-source software, the enhanced connectivity among them and to the Internet, and the integration of technologies such as IoT, AI, and big data. To face this scenario, industrial components are also becoming more complex.
Complexity is a critical aspect of industrial components design, because it is closely related to the number of vulnerabilities [29], [30].
Security is becoming a key issue in an environment where it has traditionally been addressed through security through obscurity, and treated as an add-on feature instead of being a priority during the development stage [31]-[34]. Numerous attacks targeting industrial enterprises across the globe have been reported since 2010 [35], and an exponential rise in such attacks is predicted for the upcoming years [36], [37].
Under this scenario, a model for continuous vulnerability assessment is needed to manage the security of existing industrial components during their lifespan and to deal with emerging threats [38], [39]. That is to say, performing a vulnerability analysis at a single point in time (e.g., during development or when a product has been released) is not enough for industrial components, and their long lifespan has to be considered [40], [41]. Such a model should be able to compute the number and distribution of vulnerabilities among the assets of an industrial component, and to detect and classify security vulnerabilities for further remediation or mitigation actions. Moreover, this model should be able to integrate other metrics, such as the severity value of vulnerabilities. Severity is closely related to risk assessment and threat modeling activities, so this value can help in early steps (e.g., design). By incorporating information on severity, more metrics could be developed to track and enhance the development of industrial components, which will create a more precise threat model [42]. Metrics help in prioritizing the patching of vulnerabilities, obtaining their root causes, and monitoring the evolution of industrial components, not only during their development, but also throughout their lifespan. Finally, the model should be aligned with the most relevant cybersecurity standards to enhance the evaluation tasks that they propose, or even to cover their gaps. Furthermore, both software and hardware should be evaluated, given that the strong bonding between hardware and software is an intrinsic feature of industrial components [14], [43]-[45].
Although great efforts are being made to develop new and better ways to analyze vulnerabilities [46], [47], to measure them (e.g., Common Vulnerabilities and Exposures (CVE), Common Vulnerability Scoring System (CVSS) [48]-[51], or Common Weakness Enumeration (CWE) [52], [53]), or to aggregate them [54], to the best of our knowledge, existing models do not cover industrial components. As a first step toward filling this gap, this research work proposes a model that performs a continuous vulnerability assessment to determine the source and nature of vulnerabilities, and to enhance the security of industrial components.
The proposed model is built on top of a directed graph-based structure and a set of metrics based on globally accepted security standards. This model is intended to help manage the analysis of known vulnerabilities, their root causes, and their impact on the entire life cycle of an industrial component. The internal structure of an industrial component, in terms of assets, is represented using directed graphs: different types of nodes represent vulnerabilities for both software and hardware assets. For further analysis, metrics were developed integrating internationally recognized security standards, such as Common Platform Enumeration (CPE) [55]-[57], CVE [49], CVSS [51], CWE [52], and Common Attack Pattern Enumeration and Classification (CAPEC) [58], [59]. By using the proposed metrics, the evaluator is able to identify the source and nature of the detected vulnerabilities, and to prioritize their patching. Finally, the work presented here is also aligned with the ISA/IEC 62443 standard "Security for industrial automation and control systems".
This paper is structured as follows: First, the related work is reviewed in Section II. Then, the main pieces of the proposed model are defined in Section III. Next, to demonstrate the potential of this proposal, the proposed model is applied to a real use case (Section IV). Finally, conclusions and future work of this research are described in Section V.

II. RELATED WORK
This section reviews the current status of vulnerability assessment. The review aims to find similar approaches in the literature, including current standards and metrics.

A. Vulnerability Analysis in Security Standards
Industry is currently making a significant effort to incorporate security aspects into the development of industrial components, which has led to a set of standards, such as the ISA/IEC 62443. ISA/IEC 62443 is a family of standards which includes several parts, where Part 4 is focused on the specific development of components (e.g., embedded systems, host systems, network devices, and software applications). This standard, in which certain parts are not yet fully defined, has been inspired by previous standards that have a long tradition of use, such as the Common Criteria. Consequently, this section will not only describe the ISA/IEC 62443 standard, but will also analyze the Common Criteria. This review is focused on how these standards conduct vulnerability analysis, the use of metrics, their management of the life cycle of the device, the techniques that they propose, and the security evaluation of both software and hardware.
1) ISA/IEC 62443: Based on the ISA-99 document, the ISA/IEC 62443 constitutes a series of standards, technical reports, and related information that define the procedures and requirements for implementing electronically secure Industrial Automation and Control Systems (IACSs) [60]. As expressed by this standard, security risk management shall jointly and collaboratively be addressed by all the entities involved in the design, development, integration, and maintenance of the industrial and/or automation solution (including subsystems and components) to achieve the required security level [61].
This joint effort is reflected in the organization of the documents of the standard, which is divided into four parts:
1) Part 1 - General: Provides background information such as security concepts, terminology, and metrics;
2) Part 2 - Policies and procedures: Addresses the security and patch management policies and procedures;
3) Part 3 - System: Provides system development requirements and guidance;
4) Part 4 - Component: Provides product development and technical requirements, which are intended for product vendors.

P R E P R I N T
The ISA/IEC 62443-4-1 technical document is divided into eight practices, which specify the secure product development life cycle requirements for both the development and the maintenance phases [62]. The "Practice 5 - Security verification and validation testing" (SVV) section of this document specifies that a process shall be employed to identify and characterize potential security vulnerabilities in the product, including known and unknown vulnerabilities [63], [64]. Two requirements in Practice 5 address the analysis of vulnerabilities, as follows:
• Requirement SVV-3. Vulnerability Testing [62]. This requirement states that a process shall be employed to perform tests that focus on identifying and characterizing potential and known security vulnerabilities in the product (i.e., fuzz testing, attack surface analysis, black box known vulnerability scanning, software composition analysis, and dynamic runtime resource management testing).
• Requirement SVV-4. Penetration Testing [62]. This requirement states that a process shall be employed to identify and characterize security-related issues via tests that focus on discovering and exploiting security vulnerabilities in the product (i.e., penetration testing).
Although the ISA/IEC 62443-4-1 document considers the possibility of analyzing and characterizing the vulnerabilities of an industrial component, it does not propose a technique to perform this task, but instead refers to other standards for vulnerability handling processes [65]. In addition, it does not indicate how the data obtained from the analysis should be interpreted, and it does not define metrics or reference values for the current state of compliance with the requirement. Finally, it takes into account neither the dependencies among the assets of the industrial component (dependency trees), nor the evolution of the number of vulnerabilities over time.
2) Common Criteria: The Common Criteria (CC) for Information Technology Security Evaluation (ISO/IEC 15408) is an international standard that has a long tradition in computer security certification [66]. CC is a framework which provides assurance that the processes of specification, implementation, and evaluation of a computer security product have been conducted in a rigorous, standard, and repeatable manner at a level that is commensurate with the target environment for use.
To describe the rigor and depth of an evaluation, the CC defines seven Evaluation Assurance Levels (EALs) on an increasing scale [66], from EAL1 (the most basic) to EAL7 (the most stringent security level). It is important to note that the EAL levels do not measure security itself. Instead, emphasis is given to functional testing, confirming the overall security architecture and design, and performing some testing techniques (depending on the EAL to be achieved).
The CC defines five tasks in the Vulnerability Assessment class, which determine the depth of the vulnerability assessment. The higher the EAL to be achieved, the greater the number of tasks in the list to be performed [67]:
1) Vulnerability survey,
2) Vulnerability analysis,
3) Focused vulnerability analysis,
4) Methodical vulnerability analysis, and
5) Advanced methodical vulnerability analysis.
Every task checks for the presence of publicly known vulnerabilities. Penetration testing is also performed. The main difference among the five levels of vulnerability analysis described here is the depth of the analysis of known vulnerabilities and of the penetration testing.
The CC scheme defines the general activities, but it does not specify how to perform them; therefore, no technique for analyzing vulnerabilities is proposed. The evaluator decides the most appropriate techniques for each test in each scenario and for each device, which adds a large degree of subjectivity to the evaluation. Furthermore, dependencies among vulnerabilities and assets are not considered in the analysis. Moreover, the CC does not define a procedure to manage the life cycle of the device. In other words, when updated, the whole device has to be reevaluated [31], [68]-[70]. Finally, although the usage of metrics is encouraged by the CC, it does not propose any explicitly defined metric to be used during the evaluation.

B. Vulnerability Analysis Methodologies
Vulnerability analysis is a key step towards the security evaluation of a device. Consequently, many research efforts have been focused on solving this issue. In this subsection, the most relevant works related to vulnerability analysis are reviewed.
Homer et al. [71] present a quantitative model for computer networks that objectively measures the likelihood of a vulnerability. Attack graphs, individual vulnerability metrics such as CVSS, and probabilistic reasoning are applied to produce a sound risk measurement. However, the main drawback is that their work is only applicable to computer networks. Although they propose new metrics based on the CVSS for probabilistic calculations, they do not integrate standards such as CAPEC to enhance their approach centered on possible attacks and privilege escalation. They also fail to establish a relationship among existing vulnerabilities, and they fail to obtain the source problem causing each vulnerability.
Zhang et al. [72], [73] developed a quantitative model that can be used to aggregate vulnerability metrics in an enterprise network based on attack graphs. Their model measures the likelihood that breaches can occur within a given network configuration, taking into consideration the effects of all possible interplays between vulnerabilities. This research is centered on computer networks, using attack graphs. Although the proposed model is capable of managing shared dependencies and cycles, only CVSS-related metrics are used. Moreover, this model assumes that the attacker knows all of the information in the generated attack graphs. Finally, the method that they proposed for the aggregation of metrics is only valid for attack graphs, and is not valid for vulnerability analysis.
George et al. [74] propose a graph-based model to address the security issues in Industrial IoT (IIoT) networks. Their model is useful because it represents the relationships among entities and their vulnerabilities, serving as a security framework for the risk assessment of the network. Risk mitigation strategies are also proposed. Finally, the authors discuss a method to identify the strongly connected vulnerabilities. However, the main drawback of this work is that each node of the generated attack graph represents a vulnerability instead of representing a device or an asset of that device. This leads to a loss of information in the analysis, because there is no way to know which vulnerability belongs to which device. Moreover, these methods need to know the relationships among the vulnerabilities present in the devices. This information is not trivially obtained, and a human in the loop is needed. The proposals of [75] and [76] follow a similar graph-based approach to study the effects of cascade failures in the power grid and in a subway network.
Poolsappasit et al. [77] propose a risk management framework using Bayesian networks that enables a system administrator to quantify the chances of network compromise at various levels. The authors are able to model attacks on the network, and also to integrate standardized information of the vulnerabilities involved, such as their CVSS score. Although their proposed model lends itself to dynamic analysis during the deployed phase of the network, these results can only be applied to computer networks. Meanwhile, the prior probabilities that are used in the model are assigned by network administrators, and hence are subjective. The proposed model also has some issues related to scalability.

Specifically, they demonstrate a threat modeling methodology to accurately represent the CPS elements, their interdependencies, as well as the possible attack entry points and system vulnerabilities. They present a CPS framework that is designed to delineate the hardware, software, and modeling resources that are required to simulate the CPS. They also construct high-fidelity models that can be used to evaluate the system's performance under adverse scenarios. The performance of the system is assessed using scenario-specific metrics. Meanwhile, risk assessment enables system vulnerability prioritization, while factoring the impact on the system's operation. Although this research work is comprehensive, it is focused on enhancing the existing adversary and attack modeling techniques of CPSs of the energy industry. Moreover, their model does not integrate the internal structure of the target of evaluation, and it does not take both software and hardware into account for the evaluation. Continuous evaluation over time is not considered. Finally, they do not propose countermeasures, or any kind of mechanism to enhance the security or the development of the CPSs.

Most of the works reviewed here are focused on modeling threats and attacks. They do not propose solutions to protect CPSs, to enhance their development, or to manage their updates throughout their whole life cycle. It is worth noting that they are still more focused on software evaluation, while hardware is usually neglected in their proposals.
As shown in this review, most of the research has adopted dependency trees, attack graphs, or directed graphs as the main tool to manage and assess vulnerabilities in computer networks. Graphs are an efficient technique to represent the relationships between entities, and they can also effectively encode the vulnerability relations in the network. Furthermore, the analysis of the graph can reveal the security-relevant properties of the network. For fixed infrastructure networks, graphical representations, such as attack graphs, are developed to represent the possible attack paths by exploiting the vulnerability relationships. For these reasons, vulnerability analysis techniques based on directed graphs are frequently found in the literature [82]. However, despite their potential, these analysis techniques have been relegated to vulnerability analysis in computer networks. Graph-based analysis has rarely been applied to industrial components.

C. Security Metrics
Standards of measurement and metrics are a powerful tool to manage security and for making decisions [83]-[85]. If carefully designed and chosen, metrics can provide a quantitative, repeatable, and reproducible value. This value is selected to be related to the property of interest of the systems under test (e.g., number and distribution of vulnerabilities). The use of metrics enables results to be compared over time, and among different devices. In addition, they can also be used to systematically improve the security level of a system, or to predict this security level at a future point in time.
Although the capabilities of metrics have been demonstrated, they are not free of drawbacks. In our previous research work [85], we performed a systematic review of the literature and standards. To detect possible gaps, our objective was to find which types of metrics have been proposed and in which fields they have been applied. This research work concludes that, in general, standards encourage the use of metrics, but they do not usually propose any specific set of metrics. If metrics are proposed, then they are conceived to be applied at a higher level (i.e., organization level), and thus cannot be applied to industrial components. This type of metric is usually related to measuring the return on security investment, security budget allocation, and reviewing security-related documentation.
Our previous results also highlight that scientific papers have focused their efforts on software-related metrics: 77.5% of the analyzed metrics were exclusively applicable to software (e.g., lines of code, number of functions, and so on), whereas only 0.6% were related exclusively to hardware (e.g., side-channel vulnerability factor metric). In addition, 14.8% of them could be applied to both software and hardware (e.g., the historically exploited vulnerability metric that measures the number of vulnerabilities exploited in the past), and the remaining 7.1% are focused on other aspects, such as user usability. This shows that there is a clear lack of hardware security metrics in the literature, and the main contributions are centered in software security.
Other research works also reveal common problems across security metrics [86], [87]:
• Hardly any security metric has a solid theoretical foundation or empirical evidence in support of the claimed correlation.
• Many security metrics lack an adequate description of the scale, unit, and reference values to compare and interpret the results.
• Only a few implementations or programs were available to test these security metrics, and only one of the analyzed papers performed some kind of benchmarking or comparison with similar metrics.
• The information provided in the analyzed papers is insufficient to understand whether the proposed metrics are applicable in a given context, or how to use them.
Under this scenario, it seems reasonable that future research should be focused on the development of a convincing theoretical foundation, empirical evaluation, and systematic improvement of existing approaches, in an attempt to solve the lack of widely-accepted solutions. In this research work, metrics constitute a key element. They are developed to analyze the distribution of vulnerabilities and to track their evolution over time.

III. PROPOSED APPROACH
In this research work, we propose a model for the continuous assessment of vulnerabilities over time in industrial components.
The proposed model is intended to:
• Identify the root causes and nature of vulnerabilities, which will enable the extraction of new requirements and test cases.
• Support the prioritization of patching.
• Track vulnerabilities during the whole lifespan of industrial components.
• Support the development and maintenance of industrial components.
To accomplish this task, the proposed model comprises two basic elements: the model itself, which is capable of representing the internal structure of the system under test; and a set of metrics, which allow conclusions to be drawn about the origin, distribution, and severity of vulnerabilities. Both the model and metrics are very flexible and exhibit some properties that make them suitable for industrial components, and can also be applied to enhance the ISA/IEC 62443 standard.
The content in this section is distributed into four subsections, namely:
1) Model: The proposed model is explained, together with the systems in which it can be applied and the algorithms that are used to build it.
2) Metrics: Metrics are a great tool to measure the state of the system and to track its evolution. The proposed metrics and their usage are described in this subsection.
3) Properties: The main features of the proposed model and metrics (e.g., granularity of the analysis, analysis over time, and patching policy prioritization support) are described in detail.
4) Applicability: Even though the reviewed standards exhibit some gaps, the proposed model aims to serve as the first step towards generating a set of tools to perform a vulnerability analysis in a reliable and continuous way. This last subsection discusses the requirements of the ISA/IEC 62443-4-1 that can be enhanced using our model.

A. Description of the Model
The model that is proposed in this research work is based on directed graphs, and requires knowledge of the internal structure of the device to be evaluated (i.e., the assets, both hardware and software, that comprise it and the relationships between them). This section defines the most basic elements that make up the model, the algorithms to build it for any given system, and its graphical representation.
Definition III.1. A System Under Test (SUT¹) is represented by an Extended Dependency Graph (EDG) model $G = (A, V, E)$ that is based on directed graphs, where $A$ and $V$ represent the nodes of the graphs, and $E$ represents its edges or dependencies:
• $A = \{a_1, \ldots, a_n\}$ represents the set of assets into which the SUT can be decomposed, where $n$ is the total number of obtained assets. An asset $a$ is any component of the SUT that supports information-related activities, and includes both hardware and software [88]-[90]. Each asset is characterized by its corresponding CPE identifier, while its weaknesses are characterized by the corresponding CWE identifier. In the EDG model, the assets are represented by three types of nodes in the directed graphs (i.e., root nodes, asset nodes, and clusters).
• $V = \{v_1, \ldots, v_q\}$ represents the set of known vulnerabilities that are present in each asset of $A$, where $q$ is the total number of vulnerabilities. They are characterized by the corresponding CVE and CVSS values. In the EDG model, vulnerabilities are represented using two types of nodes in the directed graphs (i.e., known vulnerability nodes and clusters).
• $E$ represents the set of edges or dependencies among the assets, and between assets and vulnerabilities. $e_{ij}$ indicates that a dependency relation is established from asset $a_i$ to asset $a_j$. Dependencies are represented using two different types of edges in the EDG (i.e., normal dependency and deprecated asset/updated vulnerability edges).
In other words, the EDG model can represent a system, from its assets to its vulnerabilities, and its dependencies as a directed graph. Assets and vulnerabilities are represented as nodes, whose dependencies are represented as arcs in the graph. The information in the EDG is further enhanced by introducing metrics.
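As an illustration only, the triple $G = (A, V, E)$ can be held in a small container. The following Python sketch is not the implementation used in this work: the class name `EDG`, its methods, and the node identifiers are hypothetical, and edges are assumed to be stored as (source, destination) pairs in which the destination depends on the source.

```python
from dataclasses import dataclass, field


@dataclass
class EDG:
    """Toy container for an Extended Dependency Graph G = (A, V, E)."""
    assets: set = field(default_factory=set)           # A: asset identifiers
    vulnerabilities: set = field(default_factory=set)  # V: vulnerability identifiers
    edges: set = field(default_factory=set)            # E: (source, destination) pairs

    def add_dependency(self, source, destination):
        """Add a directed edge; both endpoints must already be nodes."""
        nodes = self.assets | self.vulnerabilities
        if source not in nodes or destination not in nodes:
            raise KeyError("both endpoints must be nodes of the graph")
        self.edges.add((source, destination))

    def vulnerabilities_of(self, asset):
        """Return the vulnerabilities attached directly to an asset."""
        return sorted(d for s, d in self.edges
                      if s == asset and d in self.vulnerabilities)


# Hypothetical identifiers, used only to exercise the structure.
g = EDG()
g.assets.update({"root", "asset_1"})
g.vulnerabilities.add("vuln_1")
g.add_dependency("root", "asset_1")
g.add_dependency("asset_1", "vuln_1")
direct = g.vulnerabilities_of("asset_1")
```

Keeping $A$ and $V$ as separate sets mirrors the definition and makes it cheap to tell asset-to-asset edges from asset-to-vulnerability edges.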
The EDG model of a given SUT will include four types of node and two types of dependency. The graphical representation for each element is shown in Table I. Fig. 1 shows an example of a simple EDG and its basic elements. All of the elements that make up an EDG are explained in more detail below.

¹ Following the denomination in the ISA/IEC 62443 standard [60], the SUT may be an industrial component, a part of an industrial component, a set of industrial components, a unique technology that may never be made into a product, or a combination of these.

1) Types of Node: The EDG model uses four types of node:
• Root nodes represent the SUT,
• Asset nodes represent each one of the assets of the SUT,
• Known vulnerability nodes represent the vulnerabilities in the SUT, and
• Clusters summarize the information in a subgraph.
Root nodes (collectively, set $G_R$) are a special type of node that represent the whole SUT. Any EDG starts in a root node and each EDG will only have one single root node, with an associated timestamp ($t$) that indicates when the last check for changes was done. This timestamp is formatted following the structure defined in the ISO 8601 standard for date and time [91].
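A minimal sketch of how such a timestamp could be produced and read back is given below, using Python's standard `datetime` module. The record layout, the function name `make_root`, and the SUT name are hypothetical; requiring a timezone-aware value is a choice made for this sketch, not something the model prescribes.

```python
from datetime import datetime, timezone


def make_root(name, checked_at):
    """Build a root-node record whose timestamp t is an ISO 8601 string."""
    if checked_at.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid ambiguity")
    return {"type": "root", "name": name, "t": checked_at.isoformat()}


# Hypothetical SUT name and check time.
root = make_root("example_sut",
                 datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc))

# The string can be parsed back into the same instant.
parsed = datetime.fromisoformat(root["t"])
```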
Asset nodes (collectively, set $G_A$) represent the assets that comprise the SUT. The EDG model does not impose any restrictions on the minimum number of assets that the graph must have. However, the SUT can be better monitored over time when there is a higher number of assets. Moreover, the results and conclusions obtained will be much more accurate. Nevertheless, each EDG will have as many asset nodes as necessary, and the decomposition of assets can go as far and to as low a level as needed.
Each asset node will be characterized by the following set of values:
• $CPE_{current}$: Current value for the CPE. This points to the current version of the asset it refers to.
• $CPE_{previous}$: Value of the CPE that identifies the previous version of this asset. This will be used by the model to trace back all the versions of the same asset over time, from the current version to the very first version.
• $CWE_{a_i}(t)$: Set of all the weaknesses that are related to the vulnerabilities present in the asset. The content of this list can vary depending on the version of the asset.
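The trace-back through $CPE_{previous}$ can be sketched as a walk over a linked chain. In the Python example below, the vendor and product names, the helper `cpe`, and the dictionary `previous_of` are hypothetical and serve only to illustrate the idea; the strings follow the general shape of CPE 2.3 formatted names.

```python
def cpe(version):
    """Build a CPE 2.3 name for a hypothetical product at a given version."""
    return f"cpe:2.3:a:example_vendor:example_lib:{version}:*:*:*:*:*:*:*"


def version_history(cpe_current, previous_of):
    """Follow CPE_previous links from the current version to the first one."""
    history, seen = [], set()
    name = cpe_current
    while name is not None and name not in seen:  # guard against cycles
        history.append(name)
        seen.add(name)
        name = previous_of.get(name)
    return history


# CPE_previous for each known version; the first version has no predecessor.
previous_of = {cpe("1.2"): cpe("1.1"), cpe("1.1"): cpe("1.0")}
history = version_history(cpe("1.2"), previous_of)
```

The returned list runs from the current version back to the very first one, which is the order needed to compare the weaknesses of successive versions.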
Known vulnerability nodes (collectively, set $G_V$) each represent a known vulnerability present in the asset to which they relate. Each asset will have a known vulnerability node for each known vulnerability belonging to that asset. Assets alone cannot tell how severe or dangerous the vulnerabilities might be, so unique characterization of vulnerabilities is crucial [74].
To identify each known vulnerability node, each will be characterized by the following set of features (formally defined in Section III-B):
• $CVE_{a_i}(t)$: This serves as the identifier of a vulnerability of asset $a_i$.
• $CVSS_{v_i}(t)$: This metric assigns a numeric value to the severity of vulnerability $v_i$. Each CVE has a corresponding CVSS value.
• $CAPEC_{w_i}(t)$: Each vulnerability (CVE) is a materialization of a weakness (CWE) $w_i$ that can be exploited using a concrete attack pattern (CAPEC). In many cases, each CWE has more than one CAPEC associated. Consequently, this field is a set that contains all the possible attack patterns that can exploit the vulnerability that is being analyzed.
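To make these three features concrete, the sketch below stores them in a small record and maps the CVSS score to a qualitative rating. The class name, field names, and identifiers are hypothetical. The rating bands are those of the CVSS v3.x qualitative severity scale; the text does not state which CVSS version is used, so that choice is an assumption of this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VulnerabilityNode:
    cve: str                                              # identifier of the vulnerability
    cvss: float                                           # severity score in [0.0, 10.0]
    capec: frozenset = field(default_factory=frozenset)   # applicable attack patterns


def severity_rating(score):
    """Map a CVSS v3.x base score to its qualitative rating (assumed version)."""
    if not 0.0 <= score <= 10.0:
        raise ValueError("CVSS scores range from 0.0 to 10.0")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Placeholder identifiers; they do not refer to real CVE or CAPEC entries.
node = VulnerabilityNode("CVE-XXXX-0001", 7.5, frozenset({"CAPEC-A", "CAPEC-B"}))
rating = severity_rating(node.cvss)
```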
Clusters (collectively, set $G_S$) are a special type of node that summarizes and simplifies the information contained in a subgraph in an EDG. Fig. 3 shows how the clusters work.
To identify each cluster, and to be able to recover the information that they summarize, each is characterized by the data that define each of the elements that they contain: $\{CPE_{previous}, CPE_{current}, CWE_{a_i}(t)\}$, $\{CVE_{a_i}(t), CVSS_{v_i}(t), CAPEC_{w_i}(t)\}$, and their dependencies.
Two types of criteria can be used to create clusters and to simplify the obtained graph (Fig. 3: 1) Absence of vulnerabilities: Using this criterion, clusters will group all nodes that contain no associated vulnerabilities.2) CVSS score below a certain threshold: With this criterion, a threshold for the CVSS scores will be chosen.Nodes whose CVSS score is less than the defined threshold will be grouped into a cluster.
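The two clustering criteria can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the dictionary layout, the function name `cluster_candidates`, and the example scores are hypothetical, and applying the threshold to all the scores attached to an asset is an assumption made for illustration.

```python
from typing import Dict, List, Optional


def cluster_candidates(vulns_by_asset: Dict[str, List[float]],
                       threshold: Optional[float] = None) -> List[str]:
    """Return the assets that could be folded into a cluster.

    vulns_by_asset maps an asset name to the CVSS scores of its known
    vulnerabilities (hypothetical layout).
    threshold=None -> criterion 1: assets without known vulnerabilities.
    threshold=x    -> criterion 2: assets whose scores are all below x
                      (assets without vulnerabilities also qualify).
    """
    selected = []
    for asset, scores in vulns_by_asset.items():
        if threshold is None:
            if not scores:
                selected.append(asset)
        elif all(score < threshold for score in scores):
            selected.append(asset)
    return sorted(selected)


# Hypothetical SUT: asset names and scores are illustrative only.
example = {"a1": [], "a2": [9.8, 4.3], "a3": [3.1], "a4": []}
no_vuln_cluster = cluster_candidates(example)
low_score_cluster = cluster_candidates(example, threshold=5.0)
```

With the first criterion only `a1` and `a4` are grouped; with a threshold of 5.0, `a3` joins them because its single score is below the threshold.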
2) Types of Edge: In the EDG model, edges play a key role in representing dependencies. Two types of edge can be identified:
• Normal dependencies relate two assets, or an asset and a vulnerability. They represent that the destination element depends on the source element. Collectively, they are known as set $G_D$.
• Deprecated asset or patched vulnerability dependencies indicate when an asset or a vulnerability is updated or patched. They represent that the destination element used to depend on the source element. Collectively, they are known as set $G_U$.
The possibility of representing old dependencies makes it possible to reflect the evolution of the SUT over time.
When a new version of an asset is released, or a vulnerability is patched, the model is updated. The affected dependencies then change from normal dependencies to deprecated asset or patched vulnerability dependencies to reflect that change.
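The transition between the two edge sets can be sketched as follows. This is a minimal illustration under stated assumptions: the class `EdgeSets`, its method names, and the node labels are hypothetical, and edges are stored as plain (source, destination) pairs.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple

Edge = Tuple[str, str]  # (source, destination)


@dataclass
class EdgeSets:
    normal: Set[Edge] = field(default_factory=set)      # plays the role of G_D
    deprecated: Set[Edge] = field(default_factory=set)  # plays the role of G_U

    def add_dependency(self, source: str, destination: str) -> None:
        self.normal.add((source, destination))

    def deprecate_node(self, node: str) -> None:
        """Move every normal edge that touches `node` to the deprecated set."""
        moved = {edge for edge in self.normal if node in edge}
        self.normal -= moved
        self.deprecated |= moved


edges = EdgeSets()
edges.add_dependency("a1", "a2")
edges.add_dependency("a2", "v1")
edges.deprecate_node("v1")  # the vulnerability v1 is patched
```

After the call, the edge between `a2` and `v1` is kept in the deprecated set rather than deleted, which is what allows the history of the SUT to be read back from the graph.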

3) Conditions of Application of EDGs: The EDG model is applicable to SUTs that meet the following set of criteria:
• Software and hardware composition: In our approach, the model is created by means of a white-box analysis. If a white-box analysis is unavailable or impossible, the ability to create an accurate model is limited. Some knowledge about the internal structure and code is expected. This information is usually only known by the manufacturer of the component, unless the component is publicly available or open-source. It should also be possible to decompose the SUT into simpler assets to generate a relevant EDG.
• Existence of publicly known vulnerabilities: The EDG model focuses on known vulnerabilities. This is not critical because many industrial components use commercial or open-source elements. The SUT must be composed of assets for which public information is available. If the majority of SUT assets are proprietary, or the SUT is an ad hoc development that is never exposed, then the generated EDG will not evolve. Therefore, the analysis will not be relevant.
Fig. 3 (caption, continued): (2) Choosing the absence of vulnerabilities as the criterion to create clusters (lower side). The severity value (CVSS) for $v_{211}$ and $v_{212}$ is assumed to be lower than the established threshold.

4) Steps to Build the Model: This section explains the process and algorithms that were used to build the corresponding EDG of a given SUT. The main scenarios that can be found are also described.
Before extracting useful information about the SUT, the directed graph associated with the SUT has to be built. This comprises several steps, which are described in the following paragraphs (see the flowchart in Fig. 4, and Fig. 5):
Step 1 - Decompose the SUT into assets. The model relies on the SUT being decomposable into assets. With this in mind, the first step involves obtaining the assets of the SUT, either software or hardware. In the CC, this process is called modular decomposition of the SUT [66]. Ideally, every asset should be represented in the decomposition process, but this is not compulsory for the model to work properly. Each one of the assets obtained in this step will be represented as an asset node. In this step, the dependencies among the obtained assets are also added as normal dependencies.
Step 2 - Assign a CPE to each asset. Once the assets and their dependencies have been identified, the next task is to assign the corresponding CPE identifier to each asset. If there is no publicly available information about a certain asset, and it therefore has no CPE identifier, one can always be generated for internal use in the model using the fields described in the CPE naming specification documents [56].
Step 3 - Add known vulnerabilities to the assets. In this step, the vulnerabilities ($CVE_{a_i}(t)$) of each asset are set. This is done by consulting public databases of known vulnerabilities [49], [92], looking for existing vulnerabilities for each asset. When a vulnerability is found, it is added to the model of the SUT, including its dependencies. If an asset has no known vulnerabilities, then the asset becomes the last leaf of its branch. In this step, the corresponding CVSS value of each vulnerability is also added to the model.
Step 4 - Assign to each asset its weaknesses and possible CAPECs. After the vulnerabilities, the weaknesses corresponding to each vulnerability ($CWE_{a_i}(t)$) are added, along with the corresponding attack patterns ($CAPEC_{w_i}(t)$) for each weakness. If there is no known vulnerability in an asset, then there will be no weaknesses. Conversely, an asset can have a known vulnerability with no known weakness or attack pattern for that vulnerability. Finally, more than one CAPEC can be assigned to the same weakness. Consequently, it is common to have a set of possible CAPECs that can be used to exploit the same weakness. It is worth noting that not all of them are applicable in every scenario.
Step 5 - Compute metrics and track the SUT. At this point, the EDG of the SUT is completed with all the public information that can be gathered. This last step consists of calculating the defined metrics (for further information, see Section III-B), generating the corresponding reports, and tracking the state of the SUT for possible updates to the information in the model. This step is always triggered when the SUT is updated. An update can mean that a new asset appears, an old asset disappears, an old vulnerability is patched, or a new one appears in the SUT. All of these scenarios will be reflected in the model as they arise during its life cycle.
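Steps 1 to 4 can be sketched in Python as follows. This is an illustrative sketch, not the tooling used in this work: the function `build_edg`, the dictionary layout, and the lookup tables that stand in for the public databases are hypothetical, as are all the identifiers in the example. The internally generated CPE string is only a placeholder in the CPE 2.3 formatted-string layout.

```python
from typing import Dict, Iterable, List, Tuple


def build_edg(asset_names: Iterable[str],
              dependencies: Iterable[Tuple[str, str]],
              cpe_of: Dict[str, str],
              cves_of: Dict[str, List[Tuple[str, float]]],
              cwes_of: Dict[str, List[str]],
              capecs_of: Dict[str, List[str]]) -> dict:
    """Assemble a dictionary-based EDG following Steps 1 to 4."""
    # Step 1: asset nodes plus normal dependencies as (source, destination).
    edg = {"assets": {}, "vulnerabilities": {}, "edges": list(dependencies)}
    for name in asset_names:
        # Step 2: use the known CPE, or an internal placeholder if none exists.
        cpe = cpe_of.get(name, f"cpe:2.3:a:internal:{name}:*:*:*:*:*:*:*:*")
        edg["assets"][name] = {"cpe": cpe, "cwes": set()}
        # Step 3: attach known vulnerabilities and their CVSS scores.
        for cve, cvss in cves_of.get(cpe, []):
            cwes = cwes_of.get(cve, [])
            # Step 4: weaknesses of the vulnerability and their attack patterns.
            capecs = sorted({c for w in cwes for c in capecs_of.get(w, [])})
            edg["vulnerabilities"][cve] = {"cvss": cvss, "cwes": cwes,
                                           "capecs": capecs}
            edg["assets"][name]["cwes"].update(cwes)
            edg["edges"].append((name, cve))
    return edg


# Hypothetical inputs: none of these identifiers refer to real database entries.
LIB_CPE = "cpe:2.3:a:example:libfoo:1.0:*:*:*:*:*:*:*"
edg = build_edg(
    asset_names=["a1", "a2"],
    dependencies=[("a1", "a2")],
    cpe_of={"a2": LIB_CPE},
    cves_of={LIB_CPE: [("CVE-EXAMPLE-1", 7.5)]},
    cwes_of={"CVE-EXAMPLE-1": ["CWE-EXAMPLE-A"]},
    capecs_of={"CWE-EXAMPLE-A": ["CAPEC-EXAMPLE-X", "CAPEC-EXAMPLE-Y"]},
)
```

In this example `a1` has no public CPE and receives an internal one, and it remains a node without vulnerabilities, while `a2` gains one vulnerability node with its weakness and two candidate attack patterns.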

B. Security Metrics
The EDG model that was proposed in the previous sections is by itself capable of representing the internal structure of the SUT, and it can display it graphically for the user. This representation not only includes the internal assets of the SUT, but it also captures their relationships, existing vulnerabilities, and weaknesses. Moreover, assets, vulnerabilities, and weaknesses are easily identified using their corresponding CPE, CVE, and CWE values, respectively. Altogether, this constitutes a plethora of information that the model can use to improve the development and maintenance steps of the SUT, enhance its security, and track its status during its whole life cycle. Metrics are a great tool to integrate these features into the model.
Metrics can serve as a tool to manage security, make decisions, and compare results over time. They can also be used to systematically improve the security level of an industrial component or to predict its security level at a future point in time. In this section, the basic definitions that serve as the foundation of the metrics are described. Then, the proposed metrics are introduced to complement the functionality of the EDG model. The main feature of these metrics is that they all depend on time as a variable, so it is possible to capture the actual state of the SUT, track its evolution over time, and compare the results.
1) Basic Definitions: In this section, the basic concepts on which the definitions of the metrics will be based are formalized.

Definition III.2. The set of all possible weaknesses at a time $t$ is represented as $CWE(t)$, where $CWE(t) = \{cwe_1, \ldots, cwe_m\}$, and where $m$ is the total number of weaknesses at time $t$. This set contains the whole CWE database defined by MITRE [52].
Definition III.3. The set of all of the possible vulnerabilities at a time $t$ is represented as $CVE(t)$, where $CVE(t) = \{cve_1, \ldots, cve_p\}$, and where $p$ is the total number of vulnerabilities. This set contains the whole CVE database defined by MITRE [49].
Definition III.4. The set of all possible attack patterns at a time $t$ is represented as $CAPEC(t)$, where $CAPEC(t) = \{capec_1, \ldots, capec_q\}$, and where $q$ is the total number of attack patterns at time $t$. This set contains the whole CAPEC database defined by MITRE [58].
Definition III.5. The set of weaknesses of an asset $a_i$ at a time $t$ is defined as $CWE_{a_i}(t) = \{cwe_j \mid cwe_j \text{ is in the asset } a_i \text{ at time } t \wedge cwe_j \in CWE(t) \wedge \forall k \neq j,\ cwe_j \neq cwe_k\}$. From this expression, the set of all the weaknesses of a particular asset throughout its life cycle is defined as $CWE_{a_i} = \bigcup_{t=1}^{T} CWE_{a_i}(t)$, where $|CWE_{a_i}|$ is the total number of non-repeated weaknesses in its entire life cycle.
Definition III.6. The set of vulnerabilities of an asset $a_i$ at a time $t$ is defined as $CVE_{a_i}(t) = \{cve_j \mid cve_j \text{ is in the asset } a_i \text{ at time } t \wedge cve_j \in CVE(t)\}$. From this expression, the set of vulnerabilities of an asset throughout its entire life cycle is defined as $CVE_{a_i} = \bigcup_{t=1}^{T} CVE_{a_i}(t)$, where $|CVE_{a_i}|$ is the total number of vulnerabilities in its entire life cycle.
Definition III.7. The set of weaknesses of a SUT $A$ with $n$ assets at a time $t$ is defined as $CWE_A(t) = \bigcup_{i=1}^{n} CWE_{a_i}(t)$.
Definition III.8. The set of vulnerabilities of a SUT $A$ with $n$ assets at a time $t$ is defined as $CVE_A(t) = \bigcup_{i=1}^{n} CVE_{a_i}(t)$.
Definition III.9. The set of vulnerabilities associated with the weakness $cwe_j$ and with the asset $a_i$ at a time $t$ is defined as $CVE_{a_i \mid cwe_j}(t) = \{cve_k \mid cve_k \text{ is associated with weakness } cwe_j \text{ and with asset } a_i \text{ at time } t\}$. It is worth noting that CWE is used as a classification mechanism that differentiates CVEs by the type of vulnerability that they represent. A vulnerability will usually have only one associated weakness, and weaknesses can have one or more associated vulnerabilities [93].
Definition III.10. The partition $j$ of an asset $a_i$ at time $t$ conditioned by a weakness $cwe_k$ is defined as [...]
Definition III.11. The partition $j$ of the SUT $A$ at time $t$ conditioned by a weakness $cwe_k$ is defined as [...]
Definition III.12. The set of attack patterns associated with a weakness $w_i$ at a time $t$ is defined as $CAPEC_{w_i}(t) = \{capec_j \mid capec_j \text{ can exploit weakness } w_i \text{ at time } t \wedge capec_j \in CAPEC(t)\}$.
Definition III.13. $M = \{m_1, \ldots, m_r\}$ represents the set of metrics that are defined in this research work based on the EDG model, where $r$ is the total number of metrics. This set can be extended, defining more metrics according to the nature of the SUT.
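The per-asset sets of Definitions III.5, III.6, and III.9 map naturally onto set operations. The following Python sketch is only an illustration: the `history` structure, the function names, and the identifiers are hypothetical, and each vulnerability is assumed to carry a single weakness.

```python
from typing import Dict, List, Set

# Hypothetical history of one asset a_i: for each time step, a mapping from
# the CVE identifiers present in the asset to their associated CWE.
history: List[Dict[str, str]] = [
    {"cve_1": "cwe_1", "cve_2": "cwe_1"},  # t = 1
    {"cve_2": "cwe_1", "cve_3": "cwe_2"},  # t = 2
]


def cve_at(t: int) -> Set[str]:
    """CVE_{a_i}(t), with t counted from 1 as in the definitions."""
    return set(history[t - 1])


def cwe_at(t: int) -> Set[str]:
    """CWE_{a_i}(t): weaknesses without repetition."""
    return set(history[t - 1].values())


def cve_given_cwe(t: int, cwe: str) -> Set[str]:
    """CVE_{a_i | cwe_j}(t): vulnerabilities of the asset tied to one weakness."""
    return {cve for cve, weakness in history[t - 1].items() if weakness == cwe}


# Life-cycle sets: union over t = 1..T.
T = len(history)
cve_lifecycle = set().union(*(cve_at(t) for t in range(1, T + 1)))
cwe_lifecycle = set().union(*(cwe_at(t) for t in range(1, T + 1)))
```

Here `cve_2` appears at both time steps but is counted once in the life-cycle set, which is the non-repetition property that the union provides.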
2) Metrics: This section will describe the metrics that were defined based on the EDG model and the previous definitions.
Although it might seem trivial, the most interesting feature of these metrics is that they all depend on time. Using time as an input variable for the computation of the metrics opens the opportunity to track results over time, compare them, and analyze the evolution of the status of the SUT. Furthermore, some metrics take advantage of time to generate an accumulated value, giving information about the life cycle of the SUT. Table II shows all of the proposed metrics, their definitions, and their reference values.
In addition to the metrics in Table II, the model allows the definition of other types of metrics according to the analysis to be performed, and the nature of the SUT (e.g., the vulnerability evolution function for SUT A up to time t for all vulnerabilities can be defined as the linear regression of the total number of vulnerabilities in each time t for SUT A).
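As an illustration, the sketch below computes a few of these quantities over a hypothetical sequence of snapshots and fits the linear trend mentioned above. The forms used for $M_0$ (total vulnerabilities divided by the number of assets) and $M_4$ (asset count divided by SUT count) are assumptions derived from the verbal descriptions of the metrics; the snapshot layout and function names are likewise hypothetical.

```python
from typing import Dict, List, Set, Tuple

Snapshot = Dict[str, Set[str]]  # asset name -> CVE identifiers at one time t


def m1(snapshot: Snapshot) -> int:
    """M1(A, t) = |CVE_A(t)|: distinct vulnerabilities in the SUT."""
    return len(set().union(*snapshot.values()))


def m3(snapshot: Snapshot, asset: str) -> int:
    """M3(a_i, t) = |CVE_{a_i}(t)|: vulnerabilities of one asset."""
    return len(snapshot[asset])


def m0(snapshot: Snapshot) -> float:
    """Assumed form of M0: mean number of vulnerabilities per asset."""
    return m1(snapshot) / len(snapshot)


def m4(snapshot: Snapshot, asset: str) -> float:
    """Assumed form of M4: share of the SUT's vulnerabilities in one asset."""
    total = m1(snapshot)
    return m3(snapshot, asset) / total if total else 0.0


def evolution(counts: List[int]) -> Tuple[float, float]:
    """Least-squares line through (t, counts[t]); needs at least two points.

    Returns (slope, intercept) with t = 0, 1, 2, ...
    """
    n = len(counts)
    mean_t = (n - 1) / 2
    mean_y = sum(counts) / n
    var_t = sum((t - mean_t) ** 2 for t in range(n))
    cov = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(counts))
    slope = cov / var_t
    return slope, mean_y - slope * mean_t


# Hypothetical SUT observed at three points in time.
snapshots: List[Snapshot] = [
    {"a1": {"v1", "v2", "v3"}, "a2": {"v4"}},
    {"a1": {"v1", "v2"}, "a2": {"v4"}},
    {"a1": {"v1"}, "a2": {"v4"}},
]
counts = [m1(s) for s in snapshots]
slope, intercept = evolution(counts)
```

A negative slope, as in this example, indicates that the number of known vulnerabilities is decreasing across the observed versions.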

C. Properties
Together, the EDG model and the defined metrics exhibit a series of characteristics that make them suitable for vulnerability assessment. These properties represent an advantage over the techniques reviewed in the state of the art, including automatic inference of root causes, spatial and temporal distribution of vulnerabilities, and prioritization of patching, which will be described in the following subsections.
1) Automatic Inference of Root Causes: Each CWE natively contains information that is directly related to the root cause of a vulnerability. From this information, new requirements and test cases can be proposed.

2) Spatial and Temporal Distribution of Vulnerabilities:
The key feature of the proposed model is the addition of the temporal dimension in the analysis of vulnerabilities. This makes it possible to analyze the location of the vulnerabilities both in space (in which asset) and time (their recurrence), which allows us to track the state of the device throughout the whole life cycle. This approach also enables a further analysis of the SUT, by updating data in the model, such as new vulnerabilities that are found or new patches that are released.
Each time that a new vulnerability is found, or an asset is patched (i.e., via an update), the initial EDG is updated to reflect those changes. An example of this process can be seen in Fig. 6.
At time $t_0$, the initial graph of the SUT $A$ is depicted in Fig. 6. Because there is no vulnerability at that time, this graph can be simplified using the cluster notation, with just a cluster containing all assets. At time $t_1$, a new vulnerability that affects the asset $a_2$ is discovered. At time $t_2$, the asset $a_2$ is updated. This action creates a new version of asset $a_2$, asset $a_3$. Because the vulnerability was not corrected in the new update, both versions contain the vulnerability that was initially present in asset $a_2$. Finally, at time $t_3$, the asset $a_3$ is updated to its new version $a_4$, and the vulnerability is corrected.
3) Patching Policies Prioritization Support: The proposed model not only includes the known vulnerabilities associated with an asset, but also ranks them by relative importance using the CVSS. Relying on the resulting value, it is possible to assist in the vulnerability patching prioritization process. Furthermore, the presence of an existing exploit for a known vulnerability can also be taken into account when deciding which vulnerabilities need to be patched first. A vulnerability with a high CVSS value and an available exploit is a priority when patching.
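One possible way to encode such a prioritization is sketched below. The policy shown (highest CVSS first, with a known exploit as the tie-breaker) is only one of several reasonable choices and is an assumption, as are the `Vulnerability` record and the example identifiers.

```python
from typing import List, NamedTuple


class Vulnerability(NamedTuple):
    cve: str
    cvss: float
    exploit_available: bool


def patch_order(vulns: List[Vulnerability]) -> List[Vulnerability]:
    """Highest CVSS first; at equal CVSS, vulnerabilities with exploits first."""
    return sorted(vulns, key=lambda v: (-v.cvss, not v.exploit_available))


# Hypothetical backlog: identifiers and scores are illustrative only.
backlog = [
    Vulnerability("CVE-EXAMPLE-1", 7.5, False),
    Vulnerability("CVE-EXAMPLE-2", 9.8, False),
    Vulnerability("CVE-EXAMPLE-3", 9.8, True),
]
ordered = [v.cve for v in patch_order(backlog)]
```

A stricter policy could place every vulnerability with a known exploit ahead of the rest by swapping the two elements of the sort key.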

D. Applicability in the Context of ISA/IEC 62443
In this section, the potential application of the proposed EDG model to the existing security standards is described. The proposed EDG model can be used on its own, or in combination with other techniques that complement the analysis. In this sense, the EDG model can be used to enhance some tasks in the security evolution processes defined by security standards.
The ISA/IEC 62443-4-1 standard specifies 47 process requirements for the secure development of products used in industrial automation and control systems [62]. The EDG model was developed to enhance the execution of one of the requirements defined by the standard: the "SVV-3: Vulnerability testing" requirement, serving as a support for the execution of Practice 5 - Security Verification and Validation testing.
Table II: Proposed metrics, their definitions, and their reference values.
• $M_0(A, t)$: Arithmetic mean of vulnerabilities in the SUT $A$, where $n(t)$ is the number of assets in the SUT at a time $t$. $M_0$ shows how many vulnerabilities would be present in each asset if they were evenly distributed among the assets of the SUT. The result of $M_0$ can serve as a preliminary analysis of the SUT, related to the criticality of its state. Reference values: $M_0 < 1$: the number of vulnerabilities is lower than the number of assets; $M_0 \geq 1$: on average, every asset has at least one vulnerability.
• $M_1(A, t) = |CVE_A(t)|$: Number of vulnerabilities in a SUT $A$ at time $t$. Reference values: ideally zero (no vulnerability in $A$); the lower the value, the better.
• $M_2$: Number of vulnerabilities in a SUT $A$ throughout its entire life cycle $T$. This metric computes the accumulated value of the number of vulnerabilities of a SUT throughout its entire life cycle. Reference values: the lower the value, the better.
• $M_3(a_i, t) = |CVE_{a_i}(t)|$: Number of vulnerabilities in an asset $a_i$ at time $t$. The values of $M_3$ can be useful during a vulnerability analysis, or when performing a penetration test, to identify the asset with the most vulnerabilities. Reference values: ideally zero.
• $M_4(a_i, t)$: Relative frequency of vulnerabilities of the asset $a_i$ at a time $t$. This value can also be expressed as the percentage of vulnerabilities of asset $a_i$ with respect to the total number of vulnerabilities in the SUT. Reference values: ideally zero, or at least no greater than $\frac{1}{n(t)}$, where $n(t)$ is the number of assets in the SUT.
• $M_5$: Multiplicity of weakness $cwe_j$ of the asset $a_i$ at a time $t$. This metric represents the number of times a weakness is present among the vulnerabilities of the asset $a_i$. This is possible because several vulnerabilities can share the same associated weakness. Reference values: ideally zero, or at least no greater than [...], where $n(t)$ is the number of assets in the SUT. The value of the metric could be further narrowed by assuming that $cwe_j$ will be present in all but one asset [...]

Fig. 6: Representation of the temporal behavior in the graphical model using the two kinds of dependencies of the model, in cluster notation and in expanded notation. In the last step, asset $a_3$ is updated by $a_4$ and $V_1$ is patched. It is worth mentioning that these graphs could be further simplified by taking advantage of the cluster notation, as shown at the bottom of this figure.

According to the SVV-3 requirement, both known and unknown vulnerability analysis has to be performed. The EDG model proposed in this research work is intended to support the identification of known vulnerabilities, their dependencies, and the possible consequences of their propagation, yielding the opportunity to analyze them systematically. Nevertheless, more requirements of the ISA/IEC 62443 can be mapped to one or more of the metrics defined in this research work. Using this relationship, it is possible to apply the EDG model to enhance the analysis and review of the following requirements:
1) Security Requirements - 2: Threat Model (SR-2): "A process shall be employed to ensure that all products have a threat model specific to the current development scope of the product. The threat model shall be reviewed and verified periodically" [62]. The proposed EDG model can serve as an abstraction of the threat model that has to be obtained. Moreover, the standard states that this threat model has to be reviewed periodically for updates. Given that the EDG of a given SUT evolves with every update, the threat model would always be up to date. Potential threats and their severity using the CVSS can also be analyzed with this proposal. Finally, these results can be used to enhance the risk assessment of the SUT.
2) Security Management - 13: Continuous Improvement (SM-13): "A process shall be employed for continuously improving the secure development life cycle" [62]. The EDG model can be used to identify recurrent issues in the development of an industrial component, due to its ability to track the state of a SUT over time. Consider the scenario where a piece of code contains an unknown vulnerability. For example, this code can implement a communication protocol, or the generation of a cryptographic key. If this piece of code is recurrently integrated in many types of devices, then when they are released to the market, the end users can identify that vulnerability and report it to the product supplier. The EDG can reflect the presence of that vulnerability. If an EDG is built for each type of device, then this problem can be detected beforehand. Using the CWE, the root problem can be detected. With this information, new training and corrective actions can be proposed to avoid this issue.
3) Specification of Security Requirements - 5: Security Requirements Review (SR-5): "A process shall be employed to ensure that security requirements are reviewed, updated, and approved" [62]. As in the previous scenario, the information extracted from the generated EDG model can be used to propose new requirements or to update the existing requirements.
4) Security Verification and Validation Testing - 4: Penetration Testing (SVV-4): "A process shall be employed to identify and characterize security-related issues via tests that focus on discovering and exploiting security vulnerabilities in the product" [62]. The EDG model facilitates the identification of possible entry points to the SUT when carrying out a penetration test. In addition, existing attack patterns (CAPEC) and weaknesses (CWE) can serve as a starting point to discover unknown vulnerabilities and exploits.
5) Management of Security-related Issues - 3: Assessing Security-related Issues (DM-3): "A process shall be employed for analyzing security-related issues in the product" [62]. When a new vulnerability is detected, end users will report it to the product suppliers. Then, the corresponding EDG model of that SUT will be updated to reflect that change. This information, in addition to that previously contained in the EDG, can be used to obtain the severity value of the discovered vulnerability using the CVSS. This also facilitates the identification of root causes, related security issues, or the impact.
Finally, the ISA/IEC 62443-4-2 document defines four types of components of an IACS (i.e., software applications, embedded devices, host devices, network devices) [94]. The proposed model is capable of representing the inherent complexity of each of them.

IV. REAL USE CASE
In this section, the EDG model and the proposed metrics will be applied to perform a vulnerability assessment of the OpenPLC project, which will be the SUT. In the subsections, we will assess the three available versions of the OpenPLC project. For each, the EDG model will be obtained, and the proposed metrics will be applied to draw conclusions about the vulnerability status of each version.
OpenPLC is the first functional standardized open source Programmable Logic Controller (PLC), both in software and hardware [95]. It was mainly created for research purposes in the areas of industrial and home automation, Internet of Things (IoT), and SCADA. Given that it is the only controller that provides its entire source code, it represents an engaging low-cost industrial solution, not only for academic research but also for real-world automation [96], [97].

A. Structure of OpenPLC
The OpenPLC project consists of three parts:
1) Runtime: The software that plays the same role as the firmware in a traditional PLC. It executes the control program. The runtime can be installed in a variety of embedded platforms, such as the Raspberry Pi, and in Operating Systems (OSs) such as Windows or Linux. Industrial Modbus slave devices can be attached to expand the number of inputs and outputs.
2) Editor: An application that runs on a Windows or Linux OS and is used to write and compile the control programs that will later be executed by the runtime.
3) HMI Builder: This software is used to create web-based animations that reflect the state of the process, in the same manner as a traditional HMI.
When installed, the OpenPLC runtime executes a built-in web server that allows OpenPLC to be configured and new programs to be uploaded for it to run.

B. Setup Through the Analysis
The OpenPLC project comprises three different versions [98]-[100], as can be seen in Table IV. For this research work, Ubuntu Linux was selected as the platform to install the OpenPLC runtime. Ubuntu Linux provides comprehensive documentation, previous versions are accessible, and software dependencies can easily be obtained.
To make the analysis of OpenPLC fair, a contemporary operating system was selected for each of the OpenPLC versions, according to the version of Ubuntu that was available at the release time of each OpenPLC version (see Table IV). The Long Term Support (LTS) version was chosen, given that the industry tends to work with the most stable version available of any software and security updates are provided for a longer time. The scenario used for the analysis consists of OpenPLC installed on Ubuntu Linux in a virtual machine, following the OSs shown in Table IV.

C. Steps of the Analysis
The EDG model of OpenPLC was built following the steps described in Section III-A4 (see flowchart in Fig. 4). It is worth noting that these steps are followed for each one of the three available versions of OpenPLC, so an EDG is generated for each one. Obtaining the EDG for each version of OpenPLC will give information about the evolution of vulnerabilities over time.
For the sake of clarity and ease of analysis, the three obtained dependency graphs for each OpenPLC version were not merged into a single diagram. Fig. 7, Fig. 8, and Fig. 9 show the obtained EDG for versions V1, V2, and V3, respectively. In reality, these three diagrams would be the result of applying this method over time, updating the graph each time that a new vulnerability is discovered or a new patch/update is issued.

D. Analysis
In this stage, the analysis of the SUT is performed based on the generated EDG and the value of the computed metrics. This process is structured into three main steps. For each step, the analysis is done in both the spatial and temporal dimensions:
1) Spatial Dimension: This kind of analysis focuses on the distribution of vulnerabilities among the assets at a time $t$. In this example, this corresponds to independently analyzing each obtained EDG, because each one represents an instant in time (a different version of OpenPLC).
2) Temporal Dimension: This kind of analysis focuses on the evolution and distribution of vulnerabilities over time. In this example, this corresponds to analyzing the changes in the number of assets and vulnerabilities, and their distribution over all three EDGs (over all three versions of OpenPLC).
1) Analysis of the Induced EDG Model: OpenPLC V1 is analyzed in this subsection, focusing on the internal structure and dependencies among the assets.
The first result of the EDG model is the graph obtained for OpenPLC V1 (Fig. 7). From the spatial dimension point of view, assets depend on a main service, server.js, based on Node.js. This web server offers a web GUI for the user to configure, start, and stop the PLC execution. Below this level, the main components of OpenPLC can be seen: OPLC Starter (responsible for starting OpenPLC and constantly monitoring whether it is running), OPLC Compiler (compiler from ladder logic to ANSI C code), and openplc (initialization procedures for the hardware, network, and the main loop). The other assets are dynamic libraries of the system, such as libstdc++, libm, libc (for C programming), and libssl (C library for SSL and TLS).
Moreover, the libc library is a wrapper around the system calls of the Linux kernel, which provides and defines system calls and other basic functions. Thus, it is expected that all of the identified assets depend on this library in every version of OpenPLC. Fig. 7 shows that this is indeed the case.
From the temporal dimension point of view, OpenPLC V2 has to be analyzed in comparison to the previous version of OpenPLC to find changes in the structure and the number of assets. Fig. 8 shows the EDG for OpenPLC V2. Comparing OpenPLC V1 and OpenPLC V2, it can be noted that their structure is very similar. In OpenPLC V2, new assets were introduced to provide new features to the project: Matiec compiler (which is an open-source compiler for programming languages defined in the IEC 61131-3 standard), ST optimizer (which is responsible for the optimization process after the initial compilation from OpenPLC Editor), Glue Generator (which is responsible for gluing the variables from the IEC program to the OpenPLC memory pointers), and OpenDNP3 (which is an implementation of the DNP3 protocol stack written in C++11).
Fig. 7: EDG for OpenPLC V1. Notice that, for simplicity, CWE and CAPEC values are omitted, and only the CPE identifier of the SUT is shown.
In OpenPLC V2, in comparison with OpenPLC V1, the OPLC Compiler has disappeared and has been replaced by the Matiec compiler, which supports all programming languages defined in the IEC 61131-3 standard. Moreover, the ST optimizer and the Glue Generator were added to support the compilation process. Finally, the OpenDNP3 library has been added for high-performance applications, such as many concurrent TCP sessions.
Finally, keeping the analysis in the temporal dimension, OpenPLC V3 can be compared to its predecessor, OpenPLC V2. Fig. 9 shows the EDG for OpenPLC V3. This last version of the project is the simplest. The Node.js-based web server has been replaced by a Python-based web server, webserver.py.
The main components of this version are: Matiec compiler, ST optimizer, Glue Generator, OpenDNP3, and LibModbus.
2) Vulnerability Analysis: In this step, all three available versions of the OpenPLC project are analyzed from a vulnerability perspective, both in the spatial and temporal dimensions. The goal here is to use the defined metrics as a support of the analysis, obtaining the number of vulnerabilities, their distribution, and their severity score. Table V shows the values of the proposed metrics for all three versions of OpenPLC. Finally, a ranked prioritization by severity for the vulnerabilities of each version is proposed.
From the spatial dimension perspective, just by observing the generated EDG for OpenPLC V1 (Fig. 7), it can be stated that:
• [...] of the vulnerabilities in this version. A priori, just by looking at the number of vulnerabilities in each asset, it can be said that libssl is the most vulnerable asset in this version. Nevertheless, analyzing the number of vulnerabilities ($M_3$) alone is never enough: the CVSS score of each vulnerability has to be taken into account, as well as the existence of known exploits (e.g., updating the value of the CVSS using the temporal score).
• The other vulnerabilities in OpenPLC V1 affect the assets of the operating system, such as libc ($M_3 = 9$).
• The ad hoc assets developed for this project have no known vulnerabilities. This does not mean that they are secure, but rather that there are no known vulnerabilities available for them. Zero-day vulnerabilities could be present in the SUT.
A value of $M_1 = 91$ suggests a high number of vulnerabilities in this project. This is expected given that OpenPLC V1 was the first version of OpenPLC. In the following versions, the value of $M_1$ is expected to decrease.
The most striking fact when observing the EDG for OpenPLC V1 is the large number of vulnerabilities that are present in libssl. From this perspective, if a pen-tester were to evaluate OpenPLC V1, libssl would be a promising target. When the values of the CVSS are checked, three vulnerabilities have a score of 10.0 out of 10.0. As can be seen, EDGs are a powerful tool for inspecting the structure of the SUT, and they can be used to analyze how the exploitation of a vulnerability can affect the rest of the SUT.
All of the assets point to libc, which is a wrapper around the system calls of the Linux kernel. An evaluator, even without this information, can understand the importance of this dependency on libc using the EDG. On closer inspection, the highest CVSS score for the vulnerabilities of libc is 9.3 out of 10.0. If the exploitation of this vulnerability is possible, then it would allow a local user to gain privileges, exposing other assets of the SUT.
Using the severity value of each vulnerability, it is possible to generate a list of vulnerabilities to be patched. This list can be ordered either by the global CVSS or by asset. Table VI shows all the vulnerabilities whose CVSS value is between 6.0 and 10.0, ordered by asset and by descending CVSS. This can be used to decide in which order the vulnerabilities have to be patched, both at asset level and at SUT level.

From the information in Table VI, it is clear that all three vulnerabilities with a CVSS value of 10.0 are in libssl, which makes this asset the most vulnerable both by number of vulnerabilities and by CVSS. These vulnerabilities should be a priority during the patching stages. It is worth noting that libc, which is an important asset in Ubuntu Linux, has a vulnerability whose score is 9.3 (CVE-2017-16997). This should also be a priority when patching.
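The ordering used for the prioritization tables can be sketched in a few lines of Python. The record layout, the function names, and the sample entries are assumptions made for illustration. Only CVE-2017-16997 (libc, CVSS 9.3) is taken from the text; the other identifiers are placeholders, not real CVE entries.

```python
from typing import NamedTuple


class Vulnerability(NamedTuple):
    asset: str
    cve: str
    cvss: float


def prioritize(vulns, low=6.0, high=10.0):
    """Keep vulnerabilities with low <= CVSS <= high, ordered by asset, then descending CVSS."""
    selected = [v for v in vulns if low <= v.cvss <= high]
    return sorted(selected, key=lambda v: (v.asset, -v.cvss))


def prioritize_globally(vulns, low=6.0, high=10.0):
    """Same filter, but ordered by descending CVSS across the whole SUT."""
    selected = [v for v in vulns if low <= v.cvss <= high]
    return sorted(selected, key=lambda v: -v.cvss)


sample = [
    Vulnerability("libssl", "CVE-PLACEHOLDER-1", 7.5),
    Vulnerability("libc", "CVE-2017-16997", 9.3),  # value quoted in the text
    Vulnerability("libssl", "CVE-PLACEHOLDER-2", 10.0),
    Vulnerability("libssl", "CVE-PLACEHOLDER-3", 4.3),  # below the threshold, dropped
]

for v in prioritize(sample):
    print(f"{v.asset:8} {v.cve:20} {v.cvss}")
```

The per-asset ordering suits teams that patch one component at a time, while the global ordering suits a SUT-level view in which the highest scores are addressed first regardless of asset.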
Moving now to OpenPLC V2, and comparing it with OpenPLC V1 (temporal dimension), similar conclusions can be drawn:
• Most of the vulnerabilities are in third-party open-source components. Moreover, libssl (M3 = 63) remains the asset with the majority of vulnerabilities (M4 = 81.8% of them), followed by libz (M3 = 4) from node-js.
• The remaining vulnerabilities are related to the operating system, just as in OpenPLC V1: libc (M3 = 5).
For this version, M1 = 77, less than M1 = 98 for the previous version. As expected, the total number of vulnerabilities has decreased. From the spatial dimension, it is worth noting that libssl is still the asset with the highest number of vulnerabilities, which makes it a promising target for a pen-tester or an attacker. Nevertheless, the number of vulnerabilities for libssl (M3) has decreased from OpenPLC V1 to OpenPLC V2, as expected.
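As a quick check of the reported share, the sketch below recomputes the relative frequency per asset from the OpenPLC V2 counts quoted in the text. It assumes that M4 is the ratio of an asset's vulnerability count (M3) to the SUT total (M1), which is consistent with the 81.8% reported for libssl. The function name is hypothetical.

```python
def relative_frequency(asset_count, total_count):
    """Share of the SUT's vulnerabilities located in one asset, as a percentage."""
    if total_count == 0:
        return 0.0
    return 100.0 * asset_count / total_count


# OpenPLC V2 counts quoted in the text.
m1_total = 77
m3_by_asset = {"libssl": 63, "libz": 4, "libc": 5}

for asset, count in m3_by_asset.items():
    print(f"{asset}: {relative_frequency(count, m1_total):.1f}%")
```

The three listed assets do not add up to the SUT total, so the remaining vulnerabilities belong to assets that are not itemized in this paragraph.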
As was done earlier, a list of vulnerabilities ordered by asset and by descending CVSS can be generated. Table VII shows all of the vulnerabilities for OpenPLC V2 whose CVSS value is between 6.0 and 10.0.

[...] of data confidentiality and integrity. Frequently, these deal with the use of encoding techniques, encryption libraries, and hashing algorithms. The weaknesses in this category could lead to a degradation of the quality of data if they are not addressed."

Repeating this same analysis with OpenPLC V2, we can see that the number of different weaknesses (M7 = 18) is almost the same as before. In this version, however, the most repeated weakness is CWE-200⁶: "The product exposes sensitive information to an actor that is not explicitly authorized to have access to that information", followed by CWE-119.
Comparing the previous versions with OpenPLC V3, we now find that the total number of weaknesses (M7 = 2) has been drastically reduced. Nevertheless, weakness CWE-119 is still present in this version, and it is the most repeated.
This analysis can also be done by combining the information of all three versions, drawing conclusions for the whole life cycle of OpenPLC. Requirements and test cases can be extracted from the most recurrent weaknesses, and training can be proposed to avoid weaknesses that repeat over time. In our example, CWE-119 has the greatest frequency over time (

CAPEC-14: Static analysis of the code: secure functions and buffer overflow.

CAPEC-24: Feed overly long input strings to the program in an attempt to overwhelm the filter (by causing a buffer overflow), hoping that the filter does not fail securely (i.e., the user input is let into the system unfiltered).

CAPEC-45: This test uses symbolic links to cause buffer overflows. The evaluator can try to create or manipulate a symbolic link file such that its contents result in out-of-bounds data. When the target software processes the symbolic link file, it could potentially overflow internal buffers with insufficient bounds checking.

CAPEC-46: Static analysis of the code: secure functions and buffer overflow.

CAPEC-47: In this test, the target software is given input that the evaluator knows will be modified and expanded in size during processing. This test relies on the target software failing to anticipate that the expanded data may exceed some internal limit, thereby creating a buffer overflow.

CAPEC-59: This test targets predictable session IDs in order to gain privileges. The evaluator can try to predict the session ID used during a transaction to perform spoofing and session hijacking.

CAPEC-97: Automated measurement of the entropy of the keys generated.

CAPEC-475: Fuzz testing of externally provided data, such as directories and filenames.
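As an illustration of the CAPEC-97 entry, an automated entropy check could start from a Shannon entropy estimate over the bytes of the generated keys. This is only a sketch under stated assumptions: the function name is hypothetical, `secrets.token_bytes` is used as a stand-in for the key source of the SUT, and a byte-frequency estimate is a coarse sanity check rather than a complete randomness test.

```python
import math
import secrets
from collections import Counter


def shannon_entropy_bits_per_byte(data: bytes) -> float:
    """Estimate the Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Stand-in for keys produced by the SUT: pool many keys to get a usable sample.
sample = b"".join(secrets.token_bytes(32) for _ in range(1024))
print(f"{shannon_entropy_bits_per_byte(sample):.3f} bits per byte")
```

A value well below 8 bits per byte on a large pooled sample would suggest that the key generator deserves closer inspection; a value near 8 does not by itself prove that the keys are unpredictable.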
V. CONCLUSIONS AND FUTURE WORK
Industrial components have become the driving force behind the development of modern industry. They are a key element in many sectors. They perform critical tasks and they process critical data. Following the introduction of connectivity, the use of COTS and open-source components, and the exponentially increasing complexity of industrial components, the number of threats to which industrial components are exposed has increased. These factors, together with the long lifespan of industrial components, call for a vulnerability assessment methodology that monitors the entire life cycle of industrial components. Existing security standards agree that vulnerability assessment is a critical task, but they usually only consider software in their analysis, when both software and hardware should be taken into account.
The EDG model proposed in this work constitutes a method to assess known vulnerabilities that is aligned with the ISA/IEC 62443 standard. EDGs have been specifically developed to assess vulnerabilities in industrial components, but they can also be applied to any SUT that can be decomposed into assets. The key feature of the proposed model is the addition of the temporal dimension to the analysis of vulnerabilities. This makes it possible to analyze the location of vulnerabilities in both space (in which asset) and time (their recurrence), which allows the state of the device to be tracked throughout its whole life cycle. In addition, the metrics provide a wealth of information that can be used to improve the development process of the SUT, enhance its security, and track its status during its whole life cycle. The proposed EDG model has successfully integrated the following features:
• Directed graph-based analysis model.
• Temporal evolution of the SUT, its assets, and its vulnerabilities.
• Quantitative metrics to measure the state of the SUT over time.
• Assessment of vulnerabilities in both software and hardware.
Moreover, with the information gathered by the model, it is possible to assist in update management activities, apply patching policies, launch training activities, and generate new test cases and requirements.This model can serve as a starting point for pen-testers, because it identifies vulnerabilities, weaknesses, and attack patterns for a given SUT.
Finally, the OpenPLC use case served as a demonstration of the advantages of the EDG model, its applicability, and its potential. The EDGs generated for all three OpenPLC versions serve as a visual tool for the evaluators, helping them to know the preliminary status of the SUT before any further analysis. Furthermore, the proposed model is able to explore internal dependencies to understand how a vulnerability would propagate throughout the SUT. We also found that the main root causes of the vulnerabilities in OpenPLC are related to buffer management operations and cryptographic issues.
With this data, we are able to propose new requirements, test cases, and training. Lists of vulnerability prioritizations for each version were also generated.
As future work, the EDG model will be enhanced by adding a mathematical model to aggregate the values of the CVSS metric for each asset, and a value for the whole SUT. This would allow us to compare different SUTs over time. Further improvements will be made to the prioritization of patching, taking into account the context and the functionalities of the SUT. Finally, historical information about the developers can be integrated into the EDG model to predict future vulnerabilities.

Fig. 1: Basic elements of an EDG. Note that clusters are not displayed in this figure. For clusters, see Fig. 3. For the definition of the metrics, see Section III-B.

Fig. 2: Tracking dependencies between the previous and current CPE values for asset a.

Fig. 3: Creating clusters. Application of the two proposed criteria to the creation of clusters to simplify the graph: (1) Establishing a threshold to select which vulnerability stays outside the cluster (upper side). (2) Choosing the absence of vulnerability as the criterion to create clusters (lower side). The severity value (CVSS) for v_211 and v_212 is assumed to be lower than the established threshold.

Fig. 4: Algorithm to generate the initial EDG of a given SUT.

Fig. 5: Example of the process of building the EDG model of a given SUT A.

Fig. 8: EDG for OpenPLC V2. Note that, for the sake of simplicity, CWE and CAPEC values are omitted and only the CPE identifier of the SUT is shown.

Fig. 9: EDG for OpenPLC V3. Note that only the CPE of the SUT is shown for the sake of simplicity.

Fig. 10: Evolution of the number of vulnerabilities over consecutive versions of OpenPLC.

TABLE I: Overview of the information that is necessary to define each of the EDG elements.

Fig. 2 illustrates how the tracking of the versions of an asset using CPE works. On the one hand, version a_i is the current version of asset a. It contains its current CPE value, and the CPE of its previous version. On the other hand, a_2 and a_1 are previous versions of asset a. The last value of a_1 points to a null value. This indicates that it is the last value in the chain, and therefore the very first version of the asset a.
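The version chain described here behaves like a singly linked list. A minimal Python sketch follows, assuming that each version record stores its own CPE and a reference to the previous version (None for the first one). The class name, the field names, and the abbreviated CPE strings are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class AssetVersion:
    current_cpe: str
    previous: Optional["AssetVersion"] = None  # None marks the first version


def history(version: AssetVersion) -> Iterator[str]:
    """Yield CPE values from the given version back to the very first one."""
    node: Optional[AssetVersion] = version
    while node is not None:
        yield node.current_cpe
        node = node.previous


# Abbreviated placeholder CPE strings for three versions of a hypothetical asset a.
a1 = AssetVersion("cpe:2.3:a:vendor:asset_a:1.0")
a2 = AssetVersion("cpe:2.3:a:vendor:asset_a:2.0", previous=a1)
a3 = AssetVersion("cpe:2.3:a:vendor:asset_a:3.0", previous=a2)

print(list(history(a3)))
```

Walking the chain from the current version recovers every earlier CPE, which is what allows vulnerabilities to be looked up for past versions of the same asset.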

TABLE II: Proposed metrics for the model.

Ideally, the value of M1 should be zero (no weakness in A), but the lower the value of M1, the better.

Number of weaknesses in a SUT A throughout its entire life cycle T. This metric computes the accumulated value of weaknesses of a SUT throughout its entire life cycle. The lower the value of M4, the better.

$M_6(A, cwe_j, t) = |CVE_{A|cwe_j}(t)|$: Multiplicity of weakness $cwe_j$ of the SUT A at a time t. This metric represents the number of times a weakness is present among the vulnerabilities of the SUT A. Ideally, the value of M8 should be zero.

WEAKNESSES

$M_7(A, t) = |CWE_A(t)|$: Number of weaknesses in a SUT A at time t.
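The weakness metrics M6 and M7 can be sketched as follows. The sketch assumes that the vulnerabilities of the SUT at time t are available as a mapping from CVE identifier to CWE identifier. The CVE names are placeholders, and counting "CWE-NULL" as a distinct value is an assumption rather than something the table specifies.

```python
def m6_multiplicity(cve_to_cwe, cwe):
    """M6: number of vulnerabilities of the SUT associated with weakness `cwe`."""
    return sum(1 for value in cve_to_cwe.values() if value == cwe)


def m7_weakness_count(cve_to_cwe):
    """M7: number of distinct weaknesses present in the SUT."""
    return len(set(cve_to_cwe.values()))


# Placeholder snapshot of a SUT at one point in time.
snapshot = {
    "CVE-PLACEHOLDER-1": "CWE-119",
    "CVE-PLACEHOLDER-2": "CWE-119",
    "CVE-PLACEHOLDER-3": "CWE-200",
    "CVE-PLACEHOLDER-4": "CWE-NULL",  # CVE without an assigned CWE
}

print(m6_multiplicity(snapshot, "CWE-119"))
print(m7_weakness_count(snapshot))
```

Evaluating both functions on one snapshot per version gives the per-version figures that the weakness analysis compares over time.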

TABLE III: Mapping between the developed metrics and the requirements they refer to in ISA/IEC 62443. SR (Security Requirements), SM (Security Management), SVV (Security Validation and Verification), DM (Management of Security-Related Issues).

TABLE IV: Versions and release dates of OpenPLC, and the Ubuntu Linux LTS release available at each date.

1) Analysis of the induced EDG model: This step involves the analysis of the obtained directed-graph model. The structure, assets, and dependencies are the focus of this first step.
2) Vulnerability analysis: Vulnerability number, distribution, and severity are analyzed in this step, which is supported by metrics. A vulnerability prioritization is also proposed.
3) Root cause analysis (weaknesses): Finally, the root cause of each vulnerability is found (related to the associated weakness). In this step, new requirements, test cases, and training activities are proposed based on the results of the analysis.

TABLE V: Metric values for each asset and version of OpenPLC. Note that "CWE-NULL" refers to a void CWE value for a certain CVE.

TABLE VI: Vulnerability prioritization by asset and by CVSS for OpenPLC V1. Only the vulnerabilities whose value is between 6.0 and 10.0 are shown.

TABLE VII: Vulnerability prioritization by asset and by CVSS for OpenPLC V2. Only the vulnerabilities whose value is between 6.0 and 10.0 are shown.

TABLE VIII: Vulnerability prioritization by asset and by CVSS for OpenPLC V3. Only the vulnerabilities whose value is between 6.0 and 10.0 are shown.

TABLE IX: Weakness analysis for all versions of OpenPLC.

Table X shows the generated requirements, Table XI shows the proposed training, and Table XII shows the proposed test cases for OpenPLC.

TABLE XI: Example of proposed training for OpenPLC.

TABLE XII: Example of generated test cases for OpenPLC.

Check for buffer overflows through manipulation of environment variables. This test leverages implicit trust often placed in environment variables.