SC-DDPL as a Countermeasure against Static Power Side-Channel Attacks

: With the continuous scaling of CMOS technology, which has now reached the 3 nm node at production level, static power begins to dominate the power consumption of nanometer CMOS integrated circuits. A novel class of security attacks to cryptographic circuits which exploit the correlation between the static power and the secret keys was introduced more than ten years ago, and, since then, several successful key recovery experiments have been reported. These results clearly demonstrate that attacks exploiting static power (AESP) represent a serious threat for cryptographic systems implemented in nanometer CMOS technologies. In this work, we analyze the effectiveness of the Standard Cell Delay-based Precharge Logic (SC-DDPL) style in counteracting static power side-channel attacks. Experimental results on an FPGA implementation of a compact PRESENT crypto-core show that the SC-DDPL implementation allows a great improvement of all the security metrics with respect to the standard CMOS implementation and other state-of-the-art countermeasures such as WDDL and MDPL.


Introduction
Advances in semiconductor technology have been extremely beneficial for the development of new trends in electronics, as modern integrated circuits provide better performance, lower power consumption and much higher integration level. Along with advances in semiconductor technology, security features are increasingly required, especially in applications where sensitive data are handled (e.g., healthcare devices, smartphones, credit cards and IoT nodes). Apart from classical mathematical methodologies to break security schemes, in 1996, Kocher et al. formalized a new approach to recover secret keys from a physical implementation without bruteforcing the algorithm in a seminal paper [1]. The work of Kocher et al. gave rise to a new branch of research in the context of security, known as Side-Channel Analysis (SCA), which became one of the pillars of physical security in a broader sense. In SCA, an adversary would not leverage mathematical weaknesses of a cryptographic algorithm to recover sensible data, while making use of physical emissions, or side channels, directly from the device, such as power consumption [2], execution time [1] and electromagnetic emission [3]. The rationale behind SCA is based on the fact that those physical emissions can be related to the data processed within the device. Clearly, in the last two decades, SCA has become a serious threat in security and a critical issue for the designers of cryptographic hardware, therefore becoming a focal point in the development of secure products.
In the era of nanometer CMOS technologies, the role of the leakage currents of MOS transistors resulting in static power dissipation has become increasingly important in the power balance of digital integrated circuits, especially when referring to ultra-constrained applications. In addition to the technological problem due to static power consumption, an important security issue was formulated and addressed by Alioto et al. in 2010 [4]. In this context, it is important to notice that the static power of CMOS integrated circuits strongly depends on the input vectors, and the exploitation of the correlation between input data and static power paved the way for a new class of side-channel attacks. Conventional power analysis leverages dynamic power consumption, which in the past has been historically dominant in CMOS integrated circuits. Many countermeasures have been proposed against conventional power analysis attacks, mainly summarized into two categories: hiding, where the dynamic data-dependent consumption is hidden within physical and algorithmic noise, and masking, where random and independent data are processed along the real ones in order to mask the overall physical observation from the exploitable ones. The first category aims to reduce the exploitable signal compared to noise, while the second one aims to increase the algorithmic noise that the adversary can measure/observe. Clearly, in digital integrated circuits design flow, hiding and masking countermeasures can be applied at different abstraction levels. A popular way to reduce the data-dependent component of dynamic power consumption at gate-level is based on the adoption of Dual-rail Pre-charged Logics (DPLs). DPLs aim to balance the switching activity factor for each gate in a design in order to deploy an overall power consumption that is independent from processed data by means of differential gates and (differential) Return-to-Zero (RTZ) encoding. Common DPLs that have been investigated in the literature to decorrelate the power consumption from manipulated data include Wave Dynamic Differential Logic (WDDL) [5], Masked DPL (MDPL) [6], Dynamic and Differential Swing-Limited Logic (DDSLL) [7] and Sense Amplifier Based Logic (SABL) [8].
The aforementioned logic styles are theoretically secure, but, in practice, their effectiveness is strongly limited by technological issues concerning a balanced routing assumption that is hardly reachable in practice. In fact, in the presence of electrical mismatches in the capacitive (and resistive) load of complementary outputs of DPL gates, their effectiveness in counteracting power analysis attacks is heavily reduced [9], as well as in the presence of tight routing constraints. Their applicability to sub-micron technologies is worsened by the increasing and dominant role of routing in the capacitive contribution and almost impossible in FPGA, where the possibility to effectively balance the routing is strongly reduced compared to ASIC. Bucci et al. [10] proposed the Delay-based DPL (DDPL) as a power analysis-resistant logic style that is insensitive to imperfect routing. The DDPL makes use of an alternative data encoding protocol, called Time Enclosed Logic (TEL) [11]. In the TEL protocol, the datum represented by a pair of complementary wires is encoded in their time of arrival in the 0 → 1 transition, which, in other words, corresponds to the time difference of the two wires in reaching V DD . As this time difference decreases, the effect of unbalanced capacitive loads is pushed towards high frequencies, so that it can be easily cut-off by means of on-chip power-rail decoupling capacitances. An improved implementation of DDPL (namely, iDDPL) is proposed in [12], where the effectiveness of the TEL protocol in thwarting conventional power analysis attacks even in presence of imperfect routing is demonstrated on a prototype 65 nm CMOS ASIC. Both DDPL and iDDPL require a full-custom design, which inevitably impacts on the development cost of a secured hardware macro. Recently, a novel TEL-compliant standard-cell logic style has been proposed, namely Standard-Cell Delay-based DPL (SC-DDPL) [13]. SC-DDPL is insensitive to routing unbalance and offers a portable solution to protect cryptographic circuits against power analysis including on FPGA platforms. Bellizia et al. [13] reported a remarkable ability of the SC-DDPL to withstand conventional power analysis attacks based on the exploitation of dynamic power consumption.
The idea that TEL protocol may provide protection against Attacks Exploiting Static Power (AESP) techniques, as this encoding scheme strongly reduces the time interval in which data can leak, was introduced for the first time by Bellizia et al. [14], and a preliminary study of the capability of SC-DDPL to withstand AESP was presented by Bellizia [15].
In this work, we analyze in detail the resilience of the SC-DDPL against AESP, providing a twofold contribution:

•
We validate the robustness of the TEL protocol to static power analysis, according to Bellizia et al. [14]. • We validate the SC-DDPL as an effective countermeasure to this kind of attacks by means of a full set of experimental results on a reprogrammable device.
The paper is organized as follows. In Section 2, a review of AESP and the related evaluation methodologies is presented. In Section 3, we briefly recall the TEL protocol and describe the SC-DDPL operation principles and circuits. Experimental results in which AESP were carried out on several FPGA implementations of a compact PRESENT crypto-core are reported and discussed in Section 4. Finally, conclusion are drawn in Section 5.

Review of Attacks Exploiting Static Power
In this section, we briefly recall the AESP as an approach to recover sensible information from a nanometer CMOS implementation of a cryptographic circuit exploiting its static power consumption as a source of information leakage. With the aggressive scaling of MOS transistors channel lengths in the nanometer regime, several second-order effects such as reverse bias-pn junction leakage, sub-threshold leakage, drain-induced barrier lowering and threshold voltage (V TH ) roll off are no longer negligible and result in a continuous increase of static current dissipation of CMOS integrated circuits (ICs). Since the static current is correlated to the input data of CMOS cryptographic ICs, it can be considered as an additional side channel that an attacker can exploit to infer the secret keys. Among several physical effects, sub-threshold leakage currents represent the most dominant one in modern deep-scaled technologies (<100 nm). The sub-threshold leakage current I sub of MOS devices depends on several factors and a comprehensive model of this current is given by [16]: where W and L are the gate width and length, respectively; V TH is the threshold voltage; and V GS , V DS and V SB denote the gate-source voltage, the drain-source voltage and the source-body voltage, respectively. K, η and γ are technology-dependent constants and V T is the thermal voltage defined as follows: where k is the Boltzmann's constant, T is the absolute temperature and q is the electrical charge of the electron. As observed in [17], the magnitude of the static current of a circuit can be correlated with the data processed within the circuit itself. Research activity dealing with the relationship between the static power consumption of CMOS ICs and their input data has been triggered, at the beginning of this century, by the necessity to reduce the static power consumption of microprocessors implemented in nanometer CMOS processes. These studies have found that there is a strong relationship between the value of the static power dissipation of a digital CMOS circuit and the input state of the different CMOS logic gates upon which it is built. Starting from 2007, several studies have shown that the correlation between the static power consumption and the input vectors of cryptographic ICs can be exploited by a novel class of attacks firstly denoted as "Leakage Power Analysis Attacks" and then renamed as "Attacks Exploiting Static Power" (AESP) [14]. AESP are able to infer the secret keys through the exploitation of the correlation between the static power of a CMOS implementation of a cryptographic algorithm and its input data, which can be controlled by a malicious attacker. Most hardware implementations of cryptographic algorithms are based on bit-sliced circuits, in which the static power of the whole circuit can be computed as the sum of the static powers of the m-bit slices. The static power of each bit-slice is assumed to be equal to the high (or low) value of the static power P H (or P L ) according to the value '1' or '0' of the corresponding input bit [4]. Then, remembering that the number of '1's in the input vector represents its Hamming Weight (HW) w, the static power consumption of a m-bit slice is easily found to be: The above model expresses the linear dependency of the static power of a slice on the HW of its input vector [18,19].
Leveraging on the linear relationship between the static power and the Hamming Weight of input vectors of a circuit, an AESP makes use of the Pearson's correlation coefficient to exploit the data-dependency and recover sensible information. The correlation coefficient of a key guess k using l static power samples is computed as follows: where H i,k is the Hamming Weight of a intermediate function of the input with index i and key guess k (e.g., the output value of the XOR between key and input plaintext), H i is the average Hamming Weight value, P stat,i is the static power due to the input with index i and P stat is the average static power value.

Threat Model
In the context of AESP, it is common to assume that the adversary has the ability to control the clock signal and, in particular, to stop it freely (see Figure 1). This assumption is widely used in the literature, as recently discussed by Moos [20]. Moos noted that this assumption may not be needed in all cases, and it could be sufficient that the target intermediate variable is stable for some time (it is reported as a "certain number of cycles") to be exploited, even in the presence of algorithmic noise. Therefore, we can safely assume that performing experiment with such strong assumption would lead to a worst-case analysis from the security point of view.

Related Works on AESP
After a hiatus of some years, the hardware security community has recently started again to investigate on AESP and to analyze security issues due to this particular side channel. Djukanovic et al. [21] discussed the role of temperature as a dimension to increase the informativeness of static current, through simulated experiments at various temperature, also considering a multivariate approach. In [22], Moradi demonstrated the feasibility of AESP attacks over three different FPGA technologies, also adopting a protected AESbased crypto-core. These results were extended by Moos [20], presented at CHES'19, to different ASIC technologies, giving also an insight on vulnerabilities of Boolean masked implementations to AESP. Another paper presented at CHES'19, by Karimi et al. [19], proposes a study of the impact of device aging on the exploitability of static power, based on measurements taken on 65 nm prototype ASICs. This study shows that the degradation of an integrated circuit due to aging effects reduces the informativeness of static currents, hence increasing the effort for their exploitation. In [18], Moos et al. proposed a survey of the impact of several measurement factors on the outcome of an AESP, such as temperature (as in [21]), power supply voltage and integration time, providing results on a 150 nm ASIC prototype chip.
It has to be noted that the analysis of DPLs as gate-level countermeasures against attacks exploiting static power was only investigated from a simulated perspective by Bellizia et al. [14]. In addition, none of the available TEL-compatible logics have been studied experimentally under this point of view.

Security Metrics
In accordance with previous works [23,24], we compliment our AESP attacks with a leakage assessment based on the adoption of the Signal-to-Noise Ratio (SNR) proposed by Mangard [25], which allows an intuitive quantification of the side-channel signal strength against the observed noise. SNR is computed as the ratio between the variance of the data-dependent component of the power consumption σ 2 data , in this case static power, and the variance of the observed noise σ 2 noise : Usually, the SNR in static power measurements is much lower compared to dynamic ones. Therefore, we make use of a logarithmic transformation of the SNR for practical reasons: In our AESP evaluation, we use also an information theoretic approach, leveraging the concept of mutual information (MI). MI [26] is based on Shannon's conditional entropy and quantifies the amount of information leaked by the hardware implementation under test, considering the side channel as a noisy channel. It is defined as follows: where H[X] is the entropy of the secret variable (key) X, Pr(x) is the probability of the secret variable x ∈ X and Pr chip (l|x) is the probability of the leakage l given the secret x. Clearly, Pr chip (x|l) can be derived using the Bayes's theorem. In our analysis, we use the Gaussian model assumption on the distributions, as, in previous observations, it was found that this model is sound and therefore applicable. In addition to SNR and mutual information, we also perform the Test Vector Leakage Assessment (TVLA) [27] based on Welch's t-test on our static power analysis of the SC-DDPL. The t-test is a statistical test that allows verifying if two classes of samples A and B belong to the same population by means of comparing their first-order statistics, also called null hypothesis. The t-test is widely diffused in the SCA literature, and it has recently been intensively used also for AESP leakage assessment. Usually, a threshold value of the t-test score |t| > 4.5 (the value of 4.5 is used to ensure a statistical confidence higher than 0.9999) means that classes A and B do not belong to the same population (thus, rejecting the null hypothesis), remarking the presence of data-dependent leakage in the side-channel samples. The t-test score can be evaluated as follows: where L A (respectively, L B ) represents the mean value of the static power samples of Class A (respectively, Class B), σ 2 A (respectively, σ 2 B ) represents variance of Class A (respectively, Class B) and N A (respectively, N B ) is the cardinality of Class A (respectively, Class B).Classes A and B can be partitioned according to a fixed versus random approach [27] or with the fixed versus fixed one [28].

TEL Protocol and SC-DDPL
In this section, we briefly recall the TEL protocol and describe the SC-DDPL operation principles and circuits.

TEL Protocol
In the last two decades, Dual-rail Pre-charge Logics (DPLs) have been widely studied as gate-level countermeasures against conventional power analysis attacks. Most DPLs, such as WDDL, SABL and MDPL, are based on the Return-to-Zero (RTZ) protocol for encoding data processed within a secure circuit. In the RTZ protocol, a clock cycle is divided into two phases, called pre-charge and evaluation phases. During the pre-charge phase, both wires of a dual-rail pairs are pre-charged to a known value (0 or 1), while, during the evaluation phase, they assume the informative value 0/1 or 1/0 (see Table 1). Ideally, the presence of a complementary value on a dual-rail pair would allow achieving data-independent power consumption. However, this goal is hard to be reached in practice, as RTZ-based DPLs require perfect capacitive load balance. One of the main causes of capacitive unbalance is the routing between secure gates, which is usually hard to balance in deep-scaled technologies. Especially on FPGAs, where designers do not have a full control on layout results, this effect is particularly critical for the effectiveness of RTZbased DPLs in counteracting power analysis [9]. In addition, some DPLs, such as WDDL, make use of inherently asymmetric combinational gates, providing an additional source of unbalancing, also under a static power point of view.
In order to cope with these issues, the Time Enclosed Logic (TEL) protocol was introduced in 2011 [10] and later formalized in 2015 [11]. The TEL protocol encodes data in the time of arrival of the two wires in a dual-rail pair, as shown in Figure 2. The clock period T CK is divided into three phases: • Pre-charge phase (t pre ): Both wires are pre-charged to 0. • Evaluation (t eval ): One of the wires reaches V DD before the other, according to the representation of the data in Figure 2 and Table 1. It is straightforward to observe the following: In particular, the evaluation phase length is very critical from a security perspective. In fact, to reduce the information leakage due to capacitive unbalance, t eval has to be kept really short. In other words, This condition is deeply investigated in [11,13]. In fact, as the evaluation phase is reduced, the effect of capacitive unbalance in terms of information leakage's frequency content is moved to high frequencies (in the range of hundreds of MHz). From a designer perspective, this effect allows the information leakage to be filtered off (e.g., by low-pass effect of the on-chip power grid's parasitic capacitance or by explicit on-chip filtering). The value of t eval depends on several factors, such as the technology node, design requirements and security constraints. Of course, the value of t eval depends on several factors. It is evident that the value of this parameter is technology dependent, as newer technologies can support shorter evaluation phase, due to shorter propagation delay. However, this value is also design-dependent, as it is limited by the pipeline architecture, according to the time enclosed principle introduced in [11] (typical values for t eval are in the range of hundreds of picoseconds to nanoseconds, in order to shift the informative leakage frequency in the range of hundreds of MHz to GHz) and by security requirements.

SC-DDPL Operation
Recently, Bellizia et al. proposed the Standard-Cell Delay-based Dual-rail Pre-charge Logic (SC-DDPL) as the first standard-cell based logic style that can support the TEL protocol [13]. In order to be compliant with the TEL protocol, the SC-DDPL has been designed to fulfill the completeness property [29]. Inspecting the Table 1, we can clearly observe that in a TEL circuit the dual-rail information is encoded as a mutually exclusive value in asserted domain; thus, only one wire of the pair is asserted during the evaluation phase. Moreover, the NULL value is used to guarantee that the Boolean logic is symbolically complete, which implies the completeness of the set. This completeness is relevant as it allows the synchronization of the output signals of a TEL gate [29]. In order to guarantee the completeness property, each gate has to be designed according to the following requirements: It has to be noted that each signal of a gate has to perform the '0' to '1' and the '1' to '0' transitions in the same clock cycle. According to Tiri and Verbauwhede [5], Bellizia et al. [13], the latter requirement for security and TEL-compliance concerns F 1 and F 2 , which have to be positive monotonic. The NAND operator is able to satisfy all these requirements, and it has served as base function to design each SC-DDPL combinational gate. For example, the AND/NAND function can easily be derived from Equation (11) adopting a product-of-product approach: Equations (12) and (13) represent the equivalence of non-minimal AND and NAND, respectively, built upon a two-stage template. It has to be noted that a straightforward implementation in static CMOS gates of Equations (12) and (13) would lead to an asymmetric architecture of the secure SC-DDPL gate (details in [13]), and therefore Equation (12) is adapted as follows: With Equations (13) and (14), the AND/NAND function in SC-DDPL is obtained using standard-cell NAND2 and NAND3 gates, as depicted in Figure 3. It is easy to observe that from a power analysis perspective such construction allows a good balancing. Other combinational gates can be derived easily. It has to be noted that this architecture does not require any custom logic, as only static NAND2 and NAND3 gates are used (usually available in all design kits). Moreover, this architecture is easily portable to FPGA, as it is possible to design it using only LUTs. This feature represents a strong point compared to DDPL/iDDPL. As a remark, we must notice that the SC-DDPL, as any other TEL-compliant logic styles, requires the use of a second clock, namely CKD (see Figure 2), which is a delayed version of the nominal clock signal CK. Its delay corresponds to the nominal t eval , and it is distributed to input converters and flip-flops. Such delayed replica can be easily generated on-chip, reducing the attack surface to a stronger SCA adversary.

SC-DDPL Effectiveness against AESP
SC-DDPL was studied and tested along with other DPLs from a dynamic power exploitability viewpoint by Bellizia et al. [13], exhibiting stronger resilience against conventional power analysis attacks compared to other logic styles referring to both 40 nm CMOS simulations and real experiments on FPGA platforms. Bellizia et al. [14] discussed about potential vulnerability issues of DPLs regarding AESP. More precisely, mutual information analysis as a function of noise standard deviation has shown that MDPL and WDDL (both standard-cell based DPLs) exhibit a leakier behavior than static CMOS, remarking that only the full-custom SABL approach outperforms the CMOS unprotected implementation. Bellizia et al. [14] reported only a preliminary discussion about the possibility of using TEL-based logic styles as countermeasures against AESP. As an additional remark, we observe here that, since the TEL protocol encodes the information in the difference of the arrival time of two complementary wires, no potential leaks can be observed in the static power consumption if the clock is stopped. Being SC-DDPL based on the TEL protocol, we focus on the investigation and analysis of its resilience against AESP.

Experimental Results
In previous works, the possibility to use the TEL encoding as a countermeasure against AESP has only been suggested, and no experimental or simulated results are provided to support this claim. In this section, we report first experimental results on the AESP resilience of a TEL-compliant SC-DDPL FPGA implementation of a compact PRESENT crypto-core, alongside with a comparison with other RTZ-based DPLs.

Case Study -4-bit PRESENT Crypto-Core
As a case study, we considered the 4-bit cryptographic core in [13], as shown in Figure 4a, based on a nibble slice of the first round of the PRESENT algorithm [30]. PRESENT is a lightweight block cipher designed for constrained applications (e.g., RFID tags and IoT nodes) and part of the ISO/IEC 29192-2:2012 standard. The circuit implements the 4-bit XOR between the input plaintext and the key, an instance of the PRESENT 4-bit SBOX and input/output registers. We implemented four different cores, based on CMOS logic, WDDL, MDPL (the MDPL core requires randomness that we generated by means of a linear-feedback shift register, used as Pseudo-Random Number Generator (PRNG)) and SC-DDPL, in order to compare their resistance against AESP and extend the analysis in [13] to the static power domain. Clearly, input converters were added in the protected versions of the 4-bit core to allow interfacing the unsecured static CMOS domain with the secured circuitry. As Device Under Attack (DUA), we used an Intel Cyclone-IV FPGA (65 nm technology). Along with the cryptographic module, we implemented an UART interface on the FPGA, in order to deploy a communication channel with the external environment. The CKD clock replica required by the SC-DDPL was generated by the on-chip PLL and set to 4 ns.

Measurement Setup
In accordance with Bellizia et al. [31], we used a measurement setup based on the utilization of a picoammeter in place of the conventional digital storage oscilloscope, widely used in the literature to carry out attacks exploiting dynamic power. Adopting such strategy allows a direct current measurement, bypassing the need of voltage-to-current conversion, amplifiers and low-pass filters [18], hence simplifying the setup itself while avoiding unwanted distortions. As picoammeter, we used a Keithley 6485, which is designed to perform high-precision static current measurement.
In previous works, it has been demonstrated that performing experiments at higher (than room) temperature leads to more informative measurements [18,21]. Therefore, we adopted the same heating system used in [31], setting the working temperature of the DUA at 65 • C. The power supply voltage of the DUA's core was set to 1.2 V by using a bench-top power supply.
The measurement setup was controlled by a Matlab script, which provides/reads input/output data to/from the target and collects static current samples from the picoammeter. The target handled the triggering sequence of the Keithley for performing the measurement using a trigger handshake (TRG_I N and TRG_OUT signals) with the instrument. The trigger handshake was partially implemented on a control FPGA (i.e., an Intel Cyclone-II development board). A block scheme of the measurement setup and a timing diagram of the triggering sequence is depicted in Figure 5. As also suggested in [18], the integration time (also shown in Figure 1) was set as a trade-off between speed, in terms of sample collected per unit of time and noise. In our experiments, a good trade-off was found for a 0.25 power-line cycle, corresponding to 5 ms of integration time. The encryption operation takes place in three clock cycles, being the first one consumed only for loading the input plaintext and key into the core. Therefore, only the second and third cycles are meaningful for our analysis, as they are the two involved in the real cryptographic operation. Adopting the threat model in Section 2.1, and therefore assuming that the adversary has full control on the main clock signal CK, static current values were recorded for each value of the stable clock signal (CK = '0' and CK = '1') in the third cycle (see Figure 4b), hence collecting two samples of the static current. In order to perform the attack, the output nibble of the XOR operator was chosen as the target function. Then, for each value of the stable clock signal (CK = '0' and CK = '1'), 64 k values of static current were acquired for the countermeasure cores, whereas only 16k values were acquired for the CMOS core.

Leakage Assessment
The first analysis that we carried out is related to the comparison between the MDPL core and the unprotected CMOS core. The results of our evaluations referring to the evaluation phase of the MDPL core with PRNG disabled show that it exhibits an SNR about equal to the one of the CMOS implementation, whereas the mutual information of the MDPL core was found higher than the one of the unprotected CMOS core. These results are in partial agreement with those of Bellizia et al. [14]. Then, we compared all the countermeasure cores in terms of SNR and mutual information, showing that the SC-DDPL core exhibits the best values for both these metrics compared to other countermeasures. In particular, we found that the SNR is about 10 dB lower than the one of the CMOS core, whereas it is at least 5 dB lower than those of MDPL and WDDL cores. The SC-DDPL core exhibits a mutual information of 3.18 × 10 −3 bit, which is one order of magnitude lower than the mutual information of the CMOS and MDPL (with PRNG off) implementations. All the above results (summarized in Table 2) confirm the claims reported in [14]. As most of the considered case studies are non-masked, we performed the fixed versus fixed t-test, as suggested in [28], to find the evidence of leakage through this widely adopted assessment methodology. The results of the t-test analysis as a function of the number of traces are reported in Figure 6a for CK = '0' and Figure 6b for CK = '1'. The CMOS implementation is reported as a reference for comparison against protected cores. WDDL and MDPL show a meaningful t-test score already with a limited number of traces on the case with CK = '0', as the threshold value of 4.5 is exceeded at ∼10 k, as the CMOS in the same setting. The SC-DDPL does not show any meaningful leakage in both cases with CK = '0' and CK = '1'. The value at 64 k traces is 0.4 for CK = '0' and 1.42 for CK = '1'. These results are partially in line with the findings in [14] concerning RTZ-based DPLs, and they show how the TEL encoding is able to suppress the informativeness of the leakage through static power analysis.

AESP Results
The outcomes of the AESP attacks are reported in Figure 7 and summarized in Table 3 in terms of measurements-to-disclosure (MTD) for both CK = '0' and CK = '1'. The AESP attacks succeeded in retrieving the secret key only for the CMOS and MDPL implementations. In particular, the correct key of the MDPL implementation was recovered only with CK = '1' and the PRNG enabled. A possible explanation of this result can be found noting that, in the RTZ countermeasures, during the evaluation phase, the inputs of both the combinational gates and the flip-flops are set to their evaluation values. This is similar to what can be observed in the CMOS implementation, in which the inputs are always set to their evaluation values. Therefore, in all the above cases, there is a strong correlation between the input data and the static current which can be exploited by the attacker. The results of the AESP on the MDPL countermeasure (see Figure 7c) also demonstrate that the adoption of the PNRG in the MDPL is not able to counteract the effectiveness of the AESP.    As shown in Figure 7e and Table 3, the SC-DDPL implementation was not broken, even with the maximum number of measurements, confirming its capability to withstand AESP.
A further analysis of AESP outcomes can be carried out by considering the Success Value Indicator (SV I), defined as the difference between absolute correlation coefficient values of the correct key ρ corr and wrong key that exhibits the highest (absolute) correlation coefficient ρ wrong . To achieve a fair comparison among different implementations with different absolute values of the correlation coefficients, we also adopted the normalized SV I % , defined as follows: SV I % is a useful metric to quantify the effective reduction of the value of the statistical distinguisher allowed by a given countermeasure. As shown in Table 3, the SC-DDPL approach results in a SV I % of −213.92% for CK = '1' and −36.71% for CK = '0', thus outperforming other DPLs. Another important result achieved by the SC-DDPL implementation is that it exhibits the lowest correlation coefficient (|ρ corr | = 0.0079) among all other implementations.
The motivation behind these good results can be found by noting that, when the attacker stops the CK signal (and consequently stops also CKD), all combinational gates are in the pre-charge or post-evaluation phase, thus assuming the same input and output states regardless of the value of CK. Therefore, the mutual information leaked by SC-DDPL combinational gates is theoretically zero. The only information that can be reliably extracted by measuring the static power of SC-DDPL implementation is related to storage elements (e.g., flip-flops) of the system. However, the symmetric architecture of the SC-DDPL storage elements allows minimizing the information which they can leak through static power. More specifically, the presence of two symmetric and homogeneous branches for each sub-block in the flip-flop and the fact that, for each possible TEL state of the input data, the number of transistors that are switched off/on is the same for each value of CK also make the information leakage through static power of the flip-flops ideally zero.

Discussion
According to the leakage assessment and AESP outcomes, we can observe that the TEL-compatible SC-DDPL outperforms all considered RTZ-based DPLs. These results confirm the intuitions in [14] regarding the ability of TEL-compatible logics to withstand a static power adversary. The possibility to confine the information leakage in a (very) short time window that is not under control of the adversary along with intra-gate homogeneity allows designing a secure circuit that is not vulnerable to static (nor dynamic) power analysis. The outcome of this investigation suggests that TEL encoding is a valuable approach in securing circuits from SCA, also considering an adversary that is able to control the main clock. Even considering the latest findings of Moos [20], if we relax the needs of full control on the clock signal from an adversarial perspective, we expect that SC-DDPL, and in general TEL-compatible DPLs such as iDDPL, would not exhibit any meaningful informative leakage as static power consumption.
A summary of the resource utilization of the crypto-cores is reported in Table 4. It is straightforward to notice that the SC-DDPL requirements are in between those of MDPL and WDDL, making it suitable for many area-constrained applications where security against dynamic and static power analysis is required. Table 4. Utilization of the various implementations of the PRESENT crypto-core on the Intel Cyclone-IV FPGA and design characteristics. Note that the average static power consumption is reported for the whole core power supply, and thus not used resources also contribute to the overall absorption.

Conclusions
In this paper, we carry out and in-depth analysis of the effectiveness of the TELcompliant SC-DDPL logic style as a countermeasure against static power side-channel attacks. Experimental results referring to a 4-bit PRESENT crypto-core implemented on an Intel Cyclone-IV FPGA device show that the proposed SC-DDPL countermeasure outperforms standard-cell and RTZ-based dual-rail logic styles, namely WDDL and MDPL. The evaluation was carried out by means of AESP attacks and thorough leakage assessment, adopting SNR, mutual information and t-test. The SC-DDPL implementation showed a strong improvement of all the security metrics compared to reference RTZ-based DPLs and the ability to withstand AESP attacks. These experimental results confirm that the TEL encoding represents a suitable approach in counteracting AESP at gate-level, offering an additional level of protection to this less investigated (but not less important) side channel.

Data Availability Statement:
The data presented in this study are available in article.

Conflicts of Interest:
The authors declare a conflict of interest in regards of Itamar Levi, as he is a frequent collaborator.