A Gate-Level Information Leakage Detection Framework of Sequential Circuit Using Z3

Abstract: Hardware intellectual property (IP) cores from untrusted vendors are widely used, raising security concerns for system designers. Although formal methods provide powerful solutions for detecting malicious behaviors in hardware, their reliance on manual work keeps them from practical application. For example, information flow tracking (IFT) is a powerful approach to preventing leakage of sensitive information. However, existing IFT solutions either introduce hardware overheads or lack practical automatic working procedures, especially for sequential logic. To alleviate these challenges, we propose a framework that fully automates information leakage detection at the gate level. The framework uses Z3, an SMT solver, to automatically check for violations of confidentiality. In addition, an automatic tool is developed to further reduce the manual workload. In this tool, the gate-level hardware is first converted to a formal model, and the integrity of the model is assessed. Along with the model conversion step, the property for leakage detection is generated as well. The proposed solution is tested on 25 gate-level netlist benchmarks, which include sequential designs, to validate its effectiveness. As a result, Trojans leaking information through circuit outputs can be detected automatically. The measured time consumption of the entire working procedure validates the efficiency of the proposed approach.


Introduction
The demand for intellectual property (IP) cores has significantly increased owing to the changing landscape of the semiconductor industry. The proliferation of the IP market is driven by various factors such as lowered design cost and shortened time-to-market (TTM). In the meantime, the credibility of third-party vendors is threatened by hardware Trojans and design flaws, which also creates high security uncertainty for IP end-users and customers. In a system-on-chip (SoC), a malicious IP core can bypass many existing hardware Trojan detection methods [1,2].
In detecting hardware Trojans and vulnerabilities, formal methods have been the most effective among all the existing techniques [3][4][5][6][7][8][9][10][11]. However, very few current formal verification approaches are scalable and practical for hardware Trojan detection in industry due to the lack of automatic and efficient tools. For instance, model checking is a popular malicious logic detection method for protecting third-party IP cores [10]. In model checking, security properties are formalized as traces, and all possible traces generated by the system are checked. The system is said to satisfy the security property if all the traces pass the check [12]. However, when checking a very large system, the model checker typically runs into the state space explosion problem.
Information flow tracking (IFT) [13] is a scalable approach for detecting leakage (sneaky) paths of sensitive information. In IFT, data or operations are assigned labels that stand for trust levels. The labels are propagated or updated to other data according to the information flow policy. In general, labeled data are accessed by or propagated to the trusted portion of the system. Several IFT-based solutions for assuring hardware security have been proposed, such as SecVerilog [14,15], Caisson [16], Sapper [17], QIF-Verilog [18], CELLIFT [19], and gate-level information flow tracking (GLIFT) [20]. However, there is a lack of IFT solutions for detecting sneaky paths in the gate-level net-list. Theorem provers such as Coq have been used to prove gate-level information flow properties [21]. However, as with any theorem proving method, significant manual effort is required to construct machine proofs. SecChisel [22] applies automated formal verification using the Z3 solver [23], but it only provides protection during high-level synthesis.
To solve these problems, we propose a framework for formalizing and checking gate-level hardware designs for security purposes. In the framework, gate-level net-list files are parsed into a formal model in the form of constraints. Sensitive labels are then introduced to denote secrets in the hardware design. If outputs are tainted by the labels, information leakage is detected. A satisfiability modulo theories (SMT) solver is used as the checking engine to propagate the information flow and automatically check IFT policies.
The main contributions of this paper are as follows.

• We introduce an automated formal verification framework for detecting vulnerabilities in the gate-level net-list. The net-list data is formalized into a circuit model, and security properties are then designed based on the model. Confidentiality is enforced on the input hardware design by applying an automatic checking engine.

• GLIFT is, for the first time, statically applied to gate-level hardware with a fully automated working procedure. Information leakage is identified and localized by tracking sensitive information.

• An automatic tool, including a parser and an SMT solver, is designed and demonstrated based on the developed framework. The parser for translating net-list files to formal models supports multiple net-list process libraries, such as generic gate Verilog and a 180 nm CMOS library. An algorithm is developed to support the analysis of properties in sequential circuits using an SMT solver.

• The proposed framework can accurately analyze sequential information flow, and its effectiveness is demonstrated on 25 benchmarks, including combinational and sequential circuits.
The rest of the paper is organized as follows. In Section 2, we introduce the threat model, discuss previous work on malicious logic detection using IFT-based solutions, and then present a gate-level IFT model. We explain our automated framework involving the SMT solver and code parser in Section 3. Section 4 introduces the tool we designed for the automatic framework. Section 5 presents demonstrations of our approach. The limitations of this work are discussed in Section 6. Finally, conclusions are drawn in Section 7.

Attack Model
This paper assumes that information leakage paths are created either by intentionally inserted hardware Trojans or by unintentional design errors. An adversary can insert malicious logic at the design or testing stage in the supply chain. We assume that a rogue agent at the third-party IP vendor can access the register-transfer level (RTL) design or the gate-level netlist files and then insert a hardware Trojan or backdoor to create a sneaky path in the design. Hardware developers who lack security knowledge could also introduce vulnerabilities, such as leakage paths, at the design stage. In addition, we assume that attackers can access the input and output ports of the manufactured hardware and have knowledge of the hardware functionality. Therefore, by triggering the Trojan or observing the input-output patterns, the attacker can exploit such information leakage paths to infer the sensitive or secret information of the design.

Related Work
Formal methods have proven to be the most effective and complete approach to ensuring the security of integrated circuits. However, because of the lack of automated and security-oriented tools, very few formal methods are applied to hardware security verification. Moreover, some of these works are manual and dynamic, making the verification process time-consuming. The related works are summarized in Table 1. Recently, IFT-based security approaches for protecting confidentiality have been delivered as language-based solutions. Caisson [16] and Sapper [17] realize IFT isolation and separation properties in the synthesized secure circuits. In Caisson and Sapper, wires and registers are duplicated in the generated hardware, which introduces considerable hardware overheads at the circuit level. SecVerilog avoids the hardware overheads by detecting information leakage at the compilation stage [15]. It extends the type system of standard Verilog to enforce noninterference in the design. However, SecVerilog needs a complex security label system to increase precision. Circuit designers can specify information flow policies in SecVerilog only with sufficient knowledge of security. In contrast, QIF-Verilog extends standard Verilog with only one simple security label to reduce the learning cost for developers [18]. It quantifies the information leakage by applying quantitative information flow tracking at the design stage. However, QIF-Verilog does not support IFT analysis of the gate-level netlist. CELLIFT [30] provides a dynamic information flow tracking method for hardware. It leverages the logical macrocell abstraction to achieve scalability, precision, and completeness in RTL designs. However, its performance is limited by the number of logic cell types.
In [31], GLIFT is proposed to detect malicious logic by tracking the information flow in the hardware at runtime. It models logic gates and labels individual bits at the gate level. The information flow propagation logic is realized in hardware along with the original functional circuit, though with high hardware overheads [19]. A static GLIFT approach that checks security properties in the gate-level netlist is proposed in [21]. It translates the property and the netlist to theorems and formal circuits, respectively. Theorem proving is then used to prove that the formal circuit satisfies the property. Because the approach relies on interactive proving, developers construct the proofs manually, which increases the time required for certifying large hardware designs. SecChisel is proposed in [22] to check the confidentiality and integrity of hardware designs automatically using an SMT solver. Based on the Chisel hardware construction language, the SecChisel verification framework converts a higher-level hardware description to an intermediate representation, FIRRTL, and then parses it to Z3 inputs for information flow checking. Although the framework checks the IFT property automatically, it focuses on the high-level synthesis procedure rather than the gate-level netlist, and Chisel has not been widely adopted in industry. Refs. [24,32] propose a unified formal model that combines IFT taint propagation and X-propagation to verify the security and integrity of the hardware design. This work realizes efficient model building for the verification of multiple properties. However, it causes a large simulation overhead because of the extra tracking logic in the RTL code.
In our preliminary work [33], the GLIFT approach is combined with the SMT solver Z3 to detect information leakage. It translates the original circuit and the extra IFT logic from the net-list file to a static formal model. The property is designed based on the privacy of information propagation. Specifically, the security labels of sensitive input and output ports are set high to evaluate whether the sensitive information can be propagated to the output. The model and property are then input into the Z3 SMT solver for tracking the information flow. This work realized an automatic framework for GLIFT model translation, property generation, and leakage path solving. However, the framework can only handle combinational circuits. Support for sequential logic is needed as hardware designs become more and more complex.

Modeling Gate-Level IFT
An advantage of GLIFT is that each data bit is associated with a security label, which makes label propagation more precise and reduces the false-positive rate [20]. As an example, in Equation (1), we perform an AND operation between the secret signal and a 32-bit zero vector and then output the result.
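A minimal way to write this operation, assuming Output as an illustrative name for the 32-bit result and a Verilog-style literal for the zero vector:

```latex
\mathit{Output} = \text{AND-2}\left(\mathit{Secret},\ \texttt{32'h00000000}\right) \qquad (1)
```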
where the AND-2 function performs a 32-bit two-input AND operation and Secret has been labelled as highly sensitive. In the traditional IFT approach, the sensitive label would be propagated to the output port and then detected as information leakage. However, as the other signal involved in the AND operation is zero, no secret is actually leaked through this operation, so the traditional approach reports a false positive.
In GLIFT, both signal values and security labels are taken into consideration during label propagation. Rather than tracking only the data flow in the original design, GLIFT also accounts for how the output is influenced by the input values. To achieve this goal, extra logic gates are created alongside the original circuit to represent this influence. We use a two-input AND gate as the example. For each two-input AND gate, the extra logic gates are inserted as shown in Figure 1. A and B are 1-bit inputs, while O is the 1-bit output. Accordingly, the labels for A, B, and O are denoted as A_t, B_t, and O_t. Following this structure, once the low-sensitivity input is 0, the output label O_t stays 0 no matter what the value of the other, high-sensitivity input is. Only when the low-sensitivity input is 1 is O_t influenced by the high-sensitivity input, which means that the high-sensitivity label has the potential to propagate to the output.
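The behaviour described above can be sketched in a few lines of Python. The shadow-logic formula used here is the standard GLIFT formulation for a two-input AND gate and is assumed to correspond to Figure 1; the function names and the example secret value are illustrative only.

```python
from itertools import product

def glift_and(a: int, a_t: int, b: int, b_t: int) -> tuple[int, int]:
    """Return (O, O_t) for a 1-bit AND gate with GLIFT shadow logic."""
    o = a & b
    # The output is tainted only if a tainted input can actually change it.
    o_t = (a & b_t) | (b & a_t) | (a_t & b_t)
    return o, o_t

def naive_and(a: int, a_t: int, b: int, b_t: int) -> tuple[int, int]:
    """Conservative tracking: taint the output if any input is tainted."""
    return a & b, a_t | b_t

# A is a low-sensitivity input (label 0); B is the secret (label 1).
for a, b in product((0, 1), repeat=2):
    _, glift_label = glift_and(a, 0, b, 1)
    _, naive_label = naive_and(a, 0, b, 1)
    print(f"A={a} B={b}  GLIFT O_t={glift_label}  naive O_t={naive_label}")

# Bitwise version of the 32-bit example: a secret ANDed with a zero vector.
secret, zero = 0xDEADBEEF, 0  # hypothetical secret value
labels = [glift_and((secret >> k) & 1, 1, (zero >> k) & 1, 0)[1] for k in range(32)]
print("tainted output bits:", sum(labels))
```

With A fixed to 0, the GLIFT label stays low while the conservative rule still raises it, which is the false positive that bit-level tracking avoids.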
For sequential circuits, when the clock edge arrives, the signals in the circuit are propagated through the sequential logic gates. The sensitive labels should be propagated at the same time. However, previous research on sequential information flow does not always follow the theory described above. Taking the D flip-flop (DFF) as an example, Figure 2 shows the logic and label propagation rule of the DFF cell used in previous research. When the positive clock edge arrives, the signal at port D propagates to port Q. The sensitive label D_t is propagated to port Q_t without any sequential constraint. This DFF model causes no problem in dynamic information flow tracking methods, because those methods add extra circuits for the IFT logic, and the circuit implementation of the IFT logic guarantees the correctness of the information propagation. In a static information flow tracking method, however, a static model of the IFT logic is generated directly, and there is no additional physical circuit implementation. Thus, the sequential synchronization of information propagation must be ensured in the formal model so that complete information leakage paths with sequential properties can be detected.

SMT Solver
Satisfiability (SAT) solvers have been used in many electronic design automation (EDA) fields, such as logic synthesis, verification, and testing. SAT solvers were originally designed to solve the well-known Boolean satisfiability problem, which decides whether some assignment of values to the variables of a propositional logic formula satisfies the formula. SMT solvers are derived from SAT solvers by including several first-order theories, such as arithmetic, bit-vectors, and quantifiers [23]. However, due to the high computational complexity, there is no hardware implementation of SMT solvers, and software-based SMT solvers do not scale to large designs. Z3 is a popular SMT solver that provides efficient verification and analysis applications [23]. Its API is available in the Python environment as Z3Py, which is convenient for developing practical tools [34].

Methodology
The proposed framework automates formal verification by realizing IFT in the gate-level net-list design. Following our preliminary work in [33], it converts the whole hardware design to Z3 constraints and adds extra logic to track security labels. Label checking is performed in an SMT solver. In this paper, Z3 is used as the solver. Therefore, a parser is developed to translate the net-list to its equivalent Z3 constraints and to generate the extra GLIFT logic.

Framework Overview
The working procedure of the proposed formal framework is shown in Figure 3. The gate-level net-list data is input to a parser, where the original hardware design is parsed to its formal equivalent representations, called the functional circuit representations F. In the meantime, the parser generates extra logic gates to introduce and track security labels. Those logic gates are denoted as the IFT circuit representations I, which compose the GLIFT logic. Both representations F and I are in the form of Z3 constraints. Hence, we define the formal model M as in Equation (2). Taking the logic gates in Figure 1 as an example, the signals {A, B, O} are related by constraints in F, while the signals {A_t, B_t, O_t} are related by constraints in I. The corresponding procedure of deriving the formal model M is shown as follows.
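For the AND gate of Figure 1, one way to write these constraints is sketched here, under two assumptions: that M is the conjunction of F and I, and that Figure 1 implements the standard GLIFT shadow logic for a two-input AND gate.

```latex
\begin{aligned}
M &= F \wedge I \qquad (2)\\
F &:\; O = A \,\&\, B\\
I &:\; O_t = (A \,\&\, B_t) \mid (B \,\&\, A_t) \mid (A_t \,\&\, B_t)
\end{aligned}
```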
where & stands for the AND operation and | stands for the OR operation. M is the model that is input to the Z3 platform. IFT properties are denoted as P, indicating sensitive data bits. As an input to the Z3 solver, P is in the form of Z3 constraints as well. The constraints C, which need to be checked in the end, are the conjunction of M and P. Taking the circuit in Figure 1 as an example, we assume that B is of high sensitivity while A is of low sensitivity.
This leads to a label value of 1 for B_t and a label value of 0 for A_t. If the output O leaks sensitive information, then O_t will have the label 1. We can derive C as follows.
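A sketch of the resulting constraint set, again assuming the standard GLIFT shadow logic for the AND gate and taking M as the conjunction of the functional and IFT constraints:

```latex
\begin{aligned}
P &= (A_t = 0) \wedge (B_t = 1) \wedge (O_t = 1)\\
C &= M \wedge P\\
  &= \bigl(O = A \,\&\, B\bigr) \wedge \bigl(O_t = (A \,\&\, B_t) \mid (B \,\&\, A_t) \mid (A_t \,\&\, B_t)\bigr)
     \wedge (A_t = 0) \wedge (B_t = 1) \wedge (O_t = 1)
\end{aligned}
```

Substituting A_t = 0 and B_t = 1 into the label constraint reduces it to O_t = A, so under these assumptions C is satisfiable exactly when A = 1.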
The Z3 SMT solver is then used to check C. If there is no solution, then whatever the inputs are, there is no path that propagates the high-sensitivity label to the output label O_t. The design is then secure with respect to the confidentiality property. Otherwise, the high-sensitivity label can be propagated to the output port and observed by an attacker who applies the solution as the input. In this example, the solutions {A = 1, B = 0} and {A = 1, B = 1} are obtained by Z3. Therefore, the design in Figure 1 has information leakage paths.
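The check can be illustrated without a solver for a circuit this small. The sketch below replaces the Z3 query with exhaustive enumeration over all six Boolean variables; the shadow-logic formula is the standard GLIFT rule for an AND gate (assumed to match Figure 1), and the function name is hypothetical.

```python
from itertools import product

def constraints_hold(a, b, o, a_t, b_t, o_t):
    f = o == (a & b)                                   # functional circuit F
    i = o_t == ((a & b_t) | (b & a_t) | (a_t & b_t))   # IFT circuit I
    p = a_t == 0 and b_t == 1 and o_t == 1             # property P
    return f and i and p                               # C = M and P

# Collect every input pair (A, B) for which C is satisfiable.
solutions = sorted(
    {(a, b)
     for a, b, o, a_t, b_t, o_t in product((0, 1), repeat=6)
     if constraints_hold(a, b, o, a_t, b_t, o_t)}
)
print(solutions)
```

The enumeration yields the same two input vectors reported above. An SMT solver answers the same question symbolically, which is what makes the approach usable on netlists with thousands of gates.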

Sequential Split Strategy
We propose a sequential split strategy to solve the timing synchronization problem between the information flow of the original circuit and that of the extra circuit, as mentioned in Section 2.3. First, we analyze the RTL hardware program, referred to as the RTL code, before circuit synthesis. Figure 4 shows an example of sequential logic in RTL code (left side) and the result of its synthesis (right side). When the structure shown in Figure 4 appears in the code, there will be a series of DFFs in the net-list after synthesis. To avoid the sequential synchronization problem mentioned in Section 2.3, we split the RTL code into two parts: before and after the sequential statement. Each part contains only combinational logic statements. After the split, every individual part is synthesized into net-list data. That is, there is no sequential net-list code block or logic cell inside an individual part, as shown in Figure 4. Each individual net-list file can be translated to a formal model in Z3 by our automatic parser. Then, the connection information between the parts is added to the model to form the model of the whole circuit. In the end, the complete model contains several parts, where every part represents the circuit design logic and GLIFT logic of one clock cycle.
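A minimal Python sketch of the split-and-connect idea follows. The two-part circuit, its signal names, and the cycle suffixes T1 and T2 are hypothetical, and exhaustive enumeration stands in for the Z3 query; the AND and OR shadow logic is the standard GLIFT formulation.

```python
from itertools import product

def glift_and(a, a_t, b, b_t):
    return a & b, (a & b_t) | (b & a_t) | (a_t & b_t)

def glift_or(a, a_t, b, b_t):
    return a | b, ((1 - a) & b_t) | ((1 - b) & a_t) | (a_t & b_t)

def part_before_register(key, key_t, en, en_t):
    """Combinational logic evaluated in clock cycle T1."""
    return glift_and(key, key_t, en, en_t)

def part_after_register(r, r_t, mask, mask_t):
    """Combinational logic evaluated in clock cycle T2."""
    return glift_or(r, r_t, mask, mask_t)

def unrolled(key, en_T1, mask_T2):
    # Cycle T1: the secret key bit carries label 1, the other inputs label 0.
    n1_T1, n1_t_T1 = part_before_register(key, 1, en_T1, 0)
    # Connection between the parts: value and label cross the register together.
    r_T2, r_t_T2 = n1_T1, n1_t_T1
    # Cycle T2: the registered signal is combined with a second public input.
    return part_after_register(r_T2, r_t_T2, mask_T2, 0)

# Input vectors (key, en_T1, mask_T2) that raise the output label in cycle T2.
leaks = [(key, en, mask) for key, en, mask in product((0, 1), repeat=3)
         if unrolled(key, en, mask)[1] == 1]
print(leaks)
```

Because each part is purely combinational, the value and its label are handed over at the same cycle boundary, which is the synchronization that the split is meant to guarantee.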

Tool Design
We developed an automatic tool for security verification. It first translates the net-list to a Z3 model (a set of constraints) and then checks the model's integrity. After that, a time label is added to every sub-model, that is, to each of the individual parts described in Section 3.2. Along with the model establishment step, the property based on GLIFT theory is generated as well. The tool is written in Python, and its structure is shown in Figure 5. Every block in the tool structure is introduced as follows.

Net-list to Z3 Parser
We developed an automatic parser for generating Z3 constraints from the gate-level net-list. The parser is written in Python and has the structure shown in Figure 6. The parser has two parts: code analysis and code generation. The code analysis part interprets the net-list file. It collects the wires and registers used in the functional circuit. In particular, the input and output signals are extracted from those wires and registers to support the subsequent property design and model integration. Then, in the code generation part, the functional circuit representations F and the IFT circuit representations I are produced. Finally, based on the extracted inputs and outputs, F and I are integrated for the Z3 solver.
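The two parser stages can be sketched as follows. This is an illustration rather than the tool's actual implementation: it assumes a tiny subset of structural Verilog (two-input and/or primitives with the output listed first), emits constraints as Z3Py-style text, and uses hypothetical names such as GATE_RE, TEMPLATES, and parse_netlist.

```python
import re

# Code analysis: recognise instances such as "and g1 (O, A, B);".
GATE_RE = re.compile(
    r"^\s*(and|or)\s+\w+\s*\(\s*(\w+)\s*,\s*(\w+)\s*,\s*(\w+)\s*\)\s*;"
)

# Code generation: per gate type, a functional template and a GLIFT template.
TEMPLATES = {
    "and": ("{o} == And({a}, {b})",
            "{o}_t == Or(And({a}, {b}_t), And({b}, {a}_t), And({a}_t, {b}_t))"),
    "or":  ("{o} == Or({a}, {b})",
            "{o}_t == Or(And(Not({a}), {b}_t), And(Not({b}), {a}_t), "
            "And({a}_t, {b}_t))"),
}

def parse_netlist(text):
    """Return (F, I): functional and IFT constraints as lists of strings."""
    functional, ift = [], []
    for line in text.splitlines():
        m = GATE_RE.match(line)
        if not m:
            continue  # declarations and other lines are skipped in this sketch
        gate, o, a, b = m.groups()
        f_tpl, i_tpl = TEMPLATES[gate]
        functional.append(f_tpl.format(o=o, a=a, b=b))
        ift.append(i_tpl.format(o=o, a=a, b=b))
    return functional, ift

netlist = """
module demo (A, B, O);
  input A, B;
  output O;
  and g1 (O, A, B);
endmodule
"""
F, I = parse_netlist(netlist)
print(F)
print(I)
```

Keeping one template pair per cell type means that supporting another process library amounts to adding entries to the table, which is one plausible way to handle several libraries.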

Integrity Checking Module
An integrity-checking module is designed to ensure the integrity of the model. To handle sequential circuits, we split the circuit as discussed in Section 3.2. Accordingly, the models are composed of separate sub-models. Therefore, the connection signals between cascaded parts of the circuit and the corresponding IFT signals are declared explicitly. In addition, the types of logic gates that make up the net-list are listed. The result is compared with the logic gate library used in the code generation part of the net-list to Z3 parser. If any type of logic gate in the net-list does not appear in the preset logic gate library, an error is reported, which guarantees the model's integrity.
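The gate-library comparison can be sketched as below. The supported gate set, the instance pattern, and the function names are assumptions made for illustration; a real cell library would list its own cell names.

```python
import re

SUPPORTED_GATES = {"and", "or", "nand", "nor", "not", "xor"}  # assumed library

def gate_types(netlist_text):
    """Collect the cell type of every instance line 'type name (...);'."""
    pattern = re.compile(r"^\s*(\w+)\s+\w+\s*\(.*\)\s*;", re.MULTILINE)
    keywords = {"module"}  # a module header has the same shape as an instance
    return {m.group(1) for m in pattern.finditer(netlist_text)} - keywords

def check_integrity(netlist_text):
    """Raise ValueError if the net-list uses a gate type outside the library."""
    unsupported = sorted(gate_types(netlist_text) - SUPPORTED_GATES)
    if unsupported:
        raise ValueError(f"unsupported gate types: {unsupported}")
    return True

good = ("module m (A, B, O);\n  input A, B;\n  output O;\n"
        "  and g1 (O, A, B);\nendmodule\n")
bad = good.replace("and g1", "mux2 g1")  # hypothetical unsupported cell

print(check_integrity(good))
try:
    check_integrity(bad)
except ValueError as err:
    print(err)
```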

Sequential Label Setting Module
We label the model with timing tags to make the model more specific and the leakage path clearer in terms of timing. A traversal algorithm is applied to tag every variable in the model with its clock cycle. For example, the variable N3 at the first clock cycle is transformed to N3_T1 after this procedure. The functional circuit representations F and the IFT circuit representations I are transformed to F_T and I_T as the output of this module.
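As a rough illustration of the tagging step, the sketch below renames every signal in a constraint string by appending a cycle suffix. The string-based representation, the function name, and the list of reserved operator names are assumptions, not the tool's actual data structures.

```python
import re

def add_timing_tag(constraint: str, cycle: int,
                   reserved=("And", "Or", "Not")) -> str:
    """Append _T<cycle> to every signal name in a constraint string."""
    def rename(match):
        name = match.group(0)
        # Operator names are left untouched; everything else is a signal.
        return name if name in reserved else f"{name}_T{cycle}"
    return re.sub(r"[A-Za-z_]\w*", rename, constraint)

tagged_f = add_timing_tag("N3 == And(N1, N2)", 1)
tagged_i = add_timing_tag("N3_t == Or(And(N1, N2_t), And(N2, N1_t))", 2)
print(tagged_f)
print(tagged_i)
```

Tagging both the functional and the label variables with the same cycle index keeps each value paired with its label when the per-cycle parts are connected.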

Property Generation Module
Property generation is one of the essential parts of formal verification. We provide two property generation methods based on GLIFT theory. The premise of both is that the IFT labels of the security-sensitive signals are set high. The first method places an OR gate over the IFT labels of all the output signals and then checks whether any solution results in a high value at the output of the OR gate. The second method sets the IFT label of the suspect output port high and then checks whether any solution satisfies this condition. We choose one of these two methods to generate the property according to the characteristics of the circuit under test.
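The two methods can be sketched as predicates over an assignment of label values. The function names and the dictionary-based environment are hypothetical; in the framework itself, these conditions are expressed as Z3 constraints.

```python
def property_any_output(output_labels):
    """Method 1: an OR over the labels of all output signals."""
    return lambda env: any(env[name] == 1 for name in output_labels)

def property_suspect_output(suspect_label):
    """Method 2: the label of one suspect output port is high."""
    return lambda env: env[suspect_label] == 1

# Label values taken from one candidate assignment (illustrative only).
env = {"O1_t": 0, "O2_t": 1, "O3_t": 0}

any_leak = property_any_output(["O1_t", "O2_t", "O3_t"])(env)
o1_leak = property_suspect_output("O1_t")(env)
print(any_leak, o1_leak)
```

The first form suits circuits where the leaking port is unknown, while the second narrows the query to a single port and so gives the solver a smaller target.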

Experiments
This section demonstrates the proposed automated formal verification based on information flow tracking. The experiment is set up in a Python environment and evaluates the IFT property on Verilog net-list benchmarks. Trojans are inserted into the genuine benchmarks, while properties are specified by adding labels in the IFT circuit representations.

Experimental Setup
To use the proposed framework in practical applications, a developer or user only needs to indicate the high-sensitivity bits among the input signals of the IFT circuit representations and the outputs observable by attackers. In the experiment, some specific data bits are treated as secrets, and confidentiality is checked for those labeled secrets.
The main tool used for the experiments is the Z3 SMT solver. The Z3 API is available in the Python environment as Z3Py. The Z3 solver therefore runs in the same environment as the net-list to Z3 parser, which makes the toolchain easy to integrate. We employ Z3 to check whether the tainted label of the secret information can be delivered to the outputs of the IFT circuit. All the demonstrations are executed in Windows 10 on a machine with an Intel Core i3-9100 CPU (Intel Corporation, Santa Clara, CA, USA) @ 3.60 GHz and 8 GB of memory.
To demonstrate the practicality of our proposed framework, we evaluate 22 ISCAS'85 gate-level net-list benchmarks [35,36]. Those benchmarks are written in Verilog and have been synthesized using Cadence Genus. They provide combinational logic circuits that let users test different methodologies. To fit the attack model in this paper, we insert leakage paths into the designs to simulate hardware Trojans. In addition, we choose a one-round AES circuit as the benchmark to demonstrate our framework's practicality on sequential circuits. Furthermore, hardware Trojans, customized from Trojans in Trust-Hub, are inserted to establish sneaky information-leaking paths. Then, the net-list to Z3 parser translates the Trojan-inserted benchmarks to models in Z3, while the IFT logic of each benchmark is generated simultaneously. After that, we set up the solver and add the constraints that stand for the IFT properties. The model and properties are finally checked together in the Z3 platform.

Leakage Paths Checking
In Figure 7, we show a template of the inserted hardware Trojan design. All Trojans in our ISCAS85 benchmarks follow this structure and leak information about the circuit. Specifically, the inserted hardware Trojans are combinational circuits composed of AND, NAND, and NOR gates. The trigger of the Trojan is connected to the input ports and is activated by a specific input pattern. The Trojan payload enables an AND gate and passes the sensitive information to the output ports. For each combinational benchmark, we denote one data bit in a specific input as the secret and set its label high. For the sequential benchmark, we consider the 128-bit key signals as the secret information and label all 128 bits as high. Output bits that the Trojan influences are defined as vulnerable output ports. The security property is stated as "when the high-sensitivity label is assigned to a secret and low-sensitivity labels are assigned to the remaining signals, does at least one solution exist that causes the high-sensitivity label to appear on the vulnerable output ports?". In other words, if Z3 finds a solution, then the Trojan is detected. The solution serves as a counterexample: it is the input vector that propagates secrets to the outputs.
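A toy version of this trigger-and-payload structure is sketched below. The three-bit trigger pattern, the signal names, and the use of exhaustive enumeration in place of Z3 are assumptions made for illustration; the AND shadow logic is the standard GLIFT formulation.

```python
from itertools import product

def glift_and(a, a_t, b, b_t):
    return a & b, (a & b_t) | (b & a_t) | (a_t & b_t)

def trojan_payload(secret, t0, t1, t2):
    """Return (value, label) of the payload output for one input vector."""
    # Trigger: fires only on the hypothetical pattern t0 t1 t2 = 1 0 1.
    # All trigger inputs carry label 0, so the trigger label stays 0.
    trig, trig_t = glift_and(t0, 0, 1 - t1, 0)
    trig, trig_t = glift_and(trig, trig_t, t2, 0)
    # Payload: an AND gate that forwards the secret (label 1) when triggered.
    return glift_and(secret, 1, trig, trig_t)

# Trigger vectors for which the payload label can be high.
counterexamples = [
    (t0, t1, t2)
    for t0, t1, t2 in product((0, 1), repeat=3)
    if any(trojan_payload(s, t0, t1, t2)[1] for s in (0, 1))
]
print(counterexamples)
```

The only vector that raises the payload label is the trigger pattern itself, which is the kind of counterexample a satisfying assignment provides.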

Results and Analysis
Table 2 shows the results of our experiments. Again, Trojans are inserted into all the benchmarks to leak information. Among those benchmarks, the Trojans in "memetrl" and "div" are always on, while the others are triggered by a signal. We report the number of logic gates in the net-list design files in the functional gate column and the number of GLIFT gates in the formal model in the IFT gate column. The number of gates in the IFT logic is 3 to 10 times the number of functional gates. This indicates that a huge area overhead would be incurred if the GLIFT logic were implemented in real hardware circuits. The time consumed in parsing the Verilog net-list to Z3 constraints is listed as the model time. The time spent in Z3 solving is listed as the detection time. The total time column indicates the time consumed from reading in the benchmark to detecting the hardware Trojan. Taking the benchmark c6288 as an example, c6288 includes 2416 gates, from which 9364 GLIFT logic gates are generated by the parser. The time consumed by code parsing and generation is 134 ms. The Z3 solving takes 1085 ms to detect the hardware Trojan. Assuming that the security property has already been designed, the total time cost for detecting the Trojan in c6288 is 1219 ms.
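The c6288 figures quoted above can be checked directly; the symbol names below are introduced only for this calculation.

```latex
\begin{aligned}
t_{\text{total}} &= t_{\text{model}} + t_{\text{detection}}
                  = 134\ \text{ms} + 1085\ \text{ms} = 1219\ \text{ms}\\
\frac{N_{\text{IFT}}}{N_{\text{functional}}} &= \frac{9364}{2416} \approx 3.9
\end{aligned}
```

The gate ratio of about 3.9 lies at the lower end of the reported 3 to 10 times range.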
We can see that the model time of the sequential benchmarks (AES1-T1, AES1-T2, and AES1-T3) is disproportionately higher than that of the other benchmarks. Because of the difference in process library, the functional gates of the sequential benchmarks are more complex than those of the combinational benchmarks. Thus, the number of functional gates in the original circuits is proportionally lower than in the combinational benchmarks. Moreover, for sequential circuits, we label every variable in the net-list with timing tags to show which clock cycle the variable belongs to along the whole sneaky path. The timing tag labeling leads to much higher model time in sequential circuits than in combinational circuits. All Trojans in those benchmarks are detected successfully.
The largest benchmark in this experiment is div, which includes 100,985 functional gates. The total time spent on its security verification is 320,778 ms, or around 5 min. The results show that the evaluation can be finished in minutes. The proposed formal framework is therefore efficient for protecting the confidentiality of the gate-level net-list.
Further, leakage paths can be obtained by analyzing the results, which can help developers improve their designs. Figure 9 shows the leakage paths detected in the benchmark c432. The signal N17 is the tainted secret input signal, and the signal Tj-payload is the output of the Trojan. In this example, we detected 167 leakage paths in the gate-level net-list, 3 of which are shown in the figure. Developers could improve the security level by adding obfuscation on those paths.

Limitations and Discussion
Although the proposed framework performs well in detecting sneaky paths of information leakage, it has some limitations, such as the proof of very large-scale circuits and the need for a fully automatic algorithm for processing sequential logic. SMT solving is often efficient in obtaining a result when a solution exists. However, the underlying satisfiability problem is NP-hard in general, and the cost is most visible when no solution exists for the given constraints. In our framework, this means that detection is fast when a Trojan or vulnerability exists. The solution searching strategy can be optimized to improve efficiency further.
In contrast, if there are no such paths leaking secrets, the solver must rule out all possible cases before termination, which leads to high computational complexity. To address this issue, we will set a time threshold according to the size of the net-list file. SMT solving will be terminated at the threshold, and an inconclusive security check will be reported to the user.
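The thresholding idea can be sketched with a search that gives up after a time budget. This is an illustration only: exhaustive enumeration stands in for SMT solving, and the function name and budget values are hypothetical.

```python
import time
from itertools import product

def search_with_budget(predicate, num_inputs, budget_s):
    """Return ('sat', vector), ('unsat', None) or ('unknown', None)."""
    deadline = time.monotonic() + budget_s
    for vector in product((0, 1), repeat=num_inputs):
        if time.monotonic() > deadline:
            return "unknown", None  # budget exhausted: inconclusive result
        if predicate(vector):
            return "sat", vector
    return "unsat", None

# No leaking vector exists in either case; only the search space differs.
small = search_with_budget(lambda v: False, 4, budget_s=1.0)
large = search_with_budget(lambda v: False, 64, budget_s=0.05)
print(small)
print(large)
```

In Z3Py, a comparable effect is usually obtained by setting the solver's timeout parameter, in which case the check returns an unknown result rather than sat or unsat.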
Our parser currently supports parsing and generation for both combinational and sequential circuit logic. However, the sequential logic must first be handled manually before it is transformed into a formal model and verified. In the future, we will improve our framework to be fully automatic and to support large-scale circuit verification.

Conclusions
This paper proposes a formal framework to protect the confidentiality of hardware designs at the gate level. A parser generates the formal model, which is composed of a functional circuit and a GLIFT logic circuit. The Z3 solver then checks the model against an IFT property. Moreover, a sequential split algorithm is proposed to guarantee the validity of the verification result. The framework provides automatic static formal verification from the input net-list file to the IFT property checking. In the future, an automatic sequential circuit-processing module will be added to the framework. Accordingly, larger-scale benchmarks with hardware Trojans will be tested. Furthermore, we will extend the framework to cover more properties and features. The integrity property will be considered to identify malicious modifications.

Figure 3. Working procedure of the proposed formal framework.

Figure 4. Synthesis results of sequential block in RTL code.
These individual net-list files are then associated in a cascaded way. As in a pipeline, each circuit part's input and output signals are extracted as the connection between two cascaded modules in adjacent clock cycles. The logical relationship between the outputs of the modules in the previous clock cycle and the inputs of the modules in the next clock cycle is declared. The shadow logic of this relationship is declared as well.

Figure 5. Structure of the designed tool.

Figure 6. Block structure of the developed parser.

Figure 7. Design of inserted sneaky paths in ISCAS85 benchmarks.

Figure 8 shows the Trojan we designed for AES-T3. Once the input matches the preset value, the Trojan trigger outputs a high signal value. The Trojan payload is then activated to leak the secret information, the AES key.

Table 1. Summary of formal methods in hardware security.

Table 2. Tests on Trojan insertion benchmarks.