A Fault Attack on the Family of Enocoro Stream Ciphers

Abstract: A differential fault attack framework for the Enocoro family of stream ciphers is presented. We only require that the attacker can reset the internal state and inject a random byte-fault, in a random register, during a known time period. For a single fault injection, we develop a differential clocking algorithm that computes a set of linear equations in the input and output differences of the non-linear parts of the cipher and relates them to the differential keystream. The usage of these equations is two-fold. Firstly, one can determine those differentials that can be computed from the faulty keystream, and secondly they help to pin down the actual location and timing of the fault injection. Combining these results, each fault injection gives us information on specific small parts of the internal state. By encoding the information we gain from several fault injections using weighted Horn clauses, we construct a guessing path that can be used to quickly retrieve the internal state using a suitable heuristic. Finally, we evaluate our framework with the ISO-standardized and CRYPTREC candidate recommended cipher Enocoro-128v2. Simulations show that, on average, the secret key can be retrieved within 20 min on a standard workstation using fewer than five fault injections.


Introduction
Besides numerous other uses, stream ciphers play a central role in mobile telecommunications. Essentially every hand-held communication device contains a hardware implementation of one or more stream ciphers. Since these devices are readily available to a potential attacker, there is an urgent need to secure those hardware implementations against side-channel attacks targeting the secret keys stored on the device and the keystream produced from them. Therefore it is an important task to analyze the security of current stream ciphers and stream cipher proposals against various types of side-channel attacks, both of the passive type (measuring power consumption, measuring electromagnetic radiation) and of the active type (generating power glitches, laser injections, mechanical disturbances).
In this paper we examine the Enocoro family of stream ciphers developed by Hitachi Ltd. (see [1]). These stream ciphers are contained in the Japanese government's CRYPTREC recommended ciphers list (see [2]) and have been standardized by ISO in ISO/IEC 29192-3:2012 (see [3]). The Enocoro family consists of the stream ciphers Enocoro-80 (see [4]) and Enocoro-128v2 (see [1,2]). Their construction is inspired by the PANAMA hash function and stream cipher (see [5]). The security of Enocoro stream ciphers against various types of attacks has been considered in [6][7][8][9][10][11]. For instance, in [9], a slide attack is proposed, and [10] suggests a guess-and-determine attack which, despite its name, is unrelated to the attack studied here. Side-channel attacks on the family of Enocoro stream ciphers have not been studied intensively. Only [7] briefly discusses two types of passive side-channel attacks, namely Differential Power Analysis (DPA) and Correlation Power Analysis (CPA). Active side-channel attacks, in particular fault attacks, apparently have not yet been analyzed in this context. Notice that the core part of our attack relies on a guess-and-determine mechanism which has to adapt to the results of previous fault injections. For this reason, the guess-and-determine attack strategies presented in [6,10,11] cannot be adapted to the fault attack setting in a straightforward way, and a completely new approach had to be developed in Section 5.
In general, fault attacks are a well-researched topic (see [12][13][14][15][16][17][18][19]). Their main characteristic is the set of assumptions made concerning the capabilities of the attacker: physical capabilities, computing power, or prior knowledge. So, let us start by specifying our Fault Model: The proposed fault attack has byte-sized fault injections at its core. In addition, we are assuming the setting of a known plaintext attack. Since the bit stream of the plaintext is added to the keystream, this means that we are in possession of the keystream. Moreover, we assume that we can reset the internal state, typically by restarting the keystream generation with the same secret key and the same initialization vector. Of course, we do not assume the knowledge of these values, just the possibility of resets. As for the fault injections themselves, we require only a very mild spatial resolution, namely the injection of a byte fault into an unknown register, as well as a very mild temporal resolution, namely the possibility to time the injection within a small time period, i.e., a small number of clock cycles.
Feasibility: For an actual physical realization of our attack, these assumptions are not particularly hard to fulfill. Given physical hardware access, the first two assumptions can be reduced to the setting of a known plaintext attack and the assumption that we can convince the hardware device to take a certain initialization vector as an input. For the actual injection of faults, there are several different techniques available. These include the usage of electromagnetic or laser pulses, over- and under-clocking of the circuit, and voltage drops. For extensive studies on these techniques and their limitations, we refer to [20][21][22]. Practical implementations of much more demanding fault models have been reported in [23][24][25]. Based on these results, we believe that the injection of byte faults into unspecified registers on an FPGA implementation is readily achievable with current technology.
The proposed fault attack framework consists of several parts. First we introduce a differential clocking algorithm. By comparing the correct keystream with the corrupted keystream, i.e., by considering the differential keystream, it allows us to determine the precise injection point and to find differentials in certain register values. Then the task is to use these differentials to derive other register values until we are able to completely reconstruct the internal state of the cipher at one point in time. Clocking the keystream back in time, we are then able to reveal the secret key and the cipher is broken.
So, how do we use several differentials in register values to deduce the full internal state? The construction of the cipher allows us to write down equations connecting the bits of the register values and the keystream bits. Then we can make deductions of the following kind: If we know register $R'$ at time $t'$ and register $R''$ at time $t''$, we can deduce the value of register $R$ at time $t$.
(Here "register" has to be replaced by the "keystream" in some cases.) All of these deductions are phrased as propositional logic formulas, more precisely, as Horn clauses. Since there is an efficient algorithm for checking the satisfiability of a set of Horn clauses, called the Marking Algorithm, we can quickly decide if our current knowledge suffices to deduce the entire internal state or not. In actual fact, we may not know the exact value of some register at some point in time, but only a (small) set of candidate values. Then our deduction technique has to be refined as follows. We assign each Horn clause a weight corresponding to the amount of information we have about it, i.e., how large the candidate sets of values are. Then we use a suitably adjusted version of the Marking Algorithm to check if we can reduce the candidate set for the internal state to a manageable size, where "manageable" means that we can find the correct internal state by a small enough exhaustive search. Furthermore, we extend the scope of this procedure by allowing the guessing of certain register value bits (thereby increasing the search space, of course), and finding the optimal guessing path heuristically. In other words, our attack tries to model and follow the information flow during the execution of the stream cipher, and it tries to optimize the information gain which can be derived from each fault injection.
Does this approach work? For the Enocoro family of stream ciphers, it seems to work extremely well. A successful fault attack on Enocoro-128v2 is possible using approximately five fault injections on average, where the fault is assumed to be injected anytime during the first 10 clock cycles. The ensuing computation requires approximately 20 min.
Contents. Now let us describe the course of the paper in somewhat greater detail. In Section 2 we recall the definition of the family of Enocoro stream ciphers. In particular, we provide the mathematical equations defining the clock and the update functions. In the next section we use this description to make our fault model, and the assumptions underlying it, very explicit. We also introduce some common terminology for fault attacks. The topic of Section 4 is the Differential Clocking Algorithm 1. This algorithm allows us to track differentials, i.e., input and output differences, through a clock cycle. More precisely, it returns a set of linear polynomials that relate distinct differentials with one another and with the keystream output. With these linear polynomials we can trace a generic fault through the internal states and the keystream output on a cycle-by-cycle basis. One of its first uses is the possibility to locate the injection point of a fault injection up to a small number of possible cases (see Propositions 2 and 3). After determining the injection point, we can then try to find as many input and output differences as possible. Concrete results for the cipher Enocoro-128v2 are given in the last part of this section.
In Section 5 our attention shifts to combining several fault injections in an optimal way. The basic strategy is to perform a Guess-and-Determine (GD) Attack. We describe logical implications between the knowledge of various register values by Horn clauses. When dealing with fault injections, often the acquired information is not exact, i.e., instead of the precise register value, only a small set containing that value can be found. As indicated above, this is modelled by assigning a weight to the corresponding Horn clause. Now the order in which the registers can be computed, represented by their corresponding Horn clauses, is called a (weighted) guessing path. The weighted clauses here are obtained by carefully tracking how the fault injections propagate through the internal state, and how the various input and output differences that occur due to the faults are related to one another. The Guessing Path Attack Algorithm 2 allows us to find the initial state of the cipher if the guessing path has small enough weight. Since finding an optimal guessing path is not feasible, we propose a Greedy Marking Algorithm 3 as a heuristic to find a guessing path of small weight.
These techniques are put to work in Section 6 where we apply them to construct actual fault attacks on Enocoro stream ciphers by describing an Enocoro Fault Attack Framework in Algorithm 4. In the final section we execute this framework in actual simulations and report on the timings and results. Using approximately five fault injections and about 20 min of calculations, we can expect to retrieve the internal state and break the cipher Enocoro-128v2.
Unless stated otherwise, we use the definitions and notation in [26,27]. The algorithms of Sections 4 and 5 were implemented by the first author using the computer algebra system CoCoA-5 (see [28]).

Description of the Enocoro Stream Cipher Family
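This section summarizes the relevant aspects of the family of Enocoro stream ciphers as specified in [1]. We adhere to the definition of a stream cipher, as in [29]. Let $n_b \in \mathbb{N}$ and $k_1, k_2, k_3, k_4, q_1, p_1, q_2, p_2, q_3, p_3 \in \{0, \dots, n_b - 1\}$ with $p_i - q_i \neq p_j - q_j$ for $i, j \in \{1, 2, 3\}$ and $i \neq j$. Now the stream cipher $\mathrm{Enocoro}(n_b; k_1, k_2, k_3, k_4, q_1, p_1, q_2, p_2, q_3, p_3)$ is defined as follows:
(a) The cipher consists of the registers $\mathcal{R} = \{b_0, \dots, b_{n_b-1}, a_0, a_1\}$, and the value of register $R \in \mathcal{R}$ at time $t \ge 0$ is denoted by $R^{(t)} \in \mathbb{F}_2^8$. The internal state at time $t$ is the tuple of the values of the registers at that time and denoted by
$$S^{(t)} = \bigl(b_0^{(t)}, \dots, b_{n_b-1}^{(t)}, a_0^{(t)}, a_1^{(t)}\bigr) \in (\mathbb{F}_2^8)^{n_b+2}.$$
The first $n_b$ entries are also called the buffer, and the remaining two coordinates are known as the state.
(b) The next-state function $\mathrm{clock}$ maps $S^{(t)}$ to $S^{(t+1)}$ and is given by
$$b_i^{(t+1)} = \begin{cases} b_{n_b-1}^{(t)} + a_0^{(t)} & \text{if } i = 0, \\ b_{i-1}^{(t)} + b_{p_j}^{(t)} & \text{if } i = q_j + 1 \text{ for some } j \in \{1, 2, 3\}, \\ b_{i-1}^{(t)} & \text{otherwise} \end{cases}$$
for the buffer registers. Notice that restricted to this part, the update function is linear. For updating the remaining two registers, one defines an invertible linear function $L : (\mathbb{F}_2^8)^2 \to (\mathbb{F}_2^8)^2$ and a non-linear S-Box $s_8 : \mathbb{F}_2^8 \to \mathbb{F}_2^8$ whose precise construction is not relevant for our attack and is omitted here. However, the following relation is important:
$$\bigl(a_0^{(t+1)}, a_1^{(t+1)}\bigr) = L\bigl(a_0^{(t)} + s_8(b_{k_1}^{(t)}),\; a_1^{(t)} + s_8(b_{k_2}^{(t)})\bigr) + \bigl(s_8(b_{k_3}^{(t)}),\; s_8(b_{k_4}^{(t)})\bigr).$$
Observe that $\mathrm{clock}$ as a whole is an invertible function, i.e., it is possible to retrieve the so-called initial state $S^{(0)}$ from any internal state $S^{(t)}$.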
(c) The output function, denoted by $\mathrm{keystr}$, is linear and given by $\mathrm{keystr}(S^{(t)}) = a_1^{(t)}$.
(d) The general specification does not include an explicit description of the initialization function init that takes an initialization vector (IV) v ∈ IV and a secret key k ∈ K and returns the initial state S (0) . Usually this function is invertible, and heavily based on the next-state function clock.
From this general framework, two specific parameter sets were derived, for 128-bit security [1] and 80-bit security [4]. Both are standardized in ISO 29192-3 [3] (alongside the well-studied cipher Trivium [30]), and the 128-bit version is part of the Candidate Recommended List of the Japanese CRYPTREC.
Example 1 (Enocoro-128v2 [1]). For 128 bits of security, we let $n_b = 32$, $k_1 = 2$, $k_2 = 7$, $k_3 = 16$, $k_4 = 29$, $p_1 = 6$, $p_2 = 15$, $p_3 = 28$, $q_1 = 2$, $q_2 = 7$, and $q_3 = 16$. Then the internal state consists of 34 bytes. The initialization function is efficiently invertible and based on 96 applications of the next-state function $\mathrm{clock}$. It takes a key in $K = (\mathbb{F}_2^8)^{16}$ and an IV in $IV = (\mathbb{F}_2^8)^{8}$. The next-state function of Enocoro-128v2 is also shown schematically in Figure 1. From now on we assume that $\mathrm{init}$ is efficiently invertible. Thus the used key and IV can be derived quickly from the initial state $S^{(0)}$ (and any other internal state $S^{(t)}$ where $t$ is not too large). The fault attack of this paper computes the initial state $S^{(0)}$ of a given Enocoro stream cipher instance. In the last section, results from simulations and experiments with the 128-bit version of Enocoro are presented.
In the following we fix a key $k \in K$ and an IV $v \in IV$ and consider the stream cipher $\mathcal{S} = \mathrm{Enocoro}(n_b; k_1, k_2, k_3, k_4, q_1, p_1, q_2, p_2, q_3, p_3)$ with an (efficiently) invertible initialization function. We also denote the sequence of internal states by $(S^{(t)})_{t \ge 0}$ and the keystream output by $(z^{(t)})_{t \ge 0}$.

Our Fault Model
Our fault attack is based on a differential fault analysis which is ultimately reduced to a close inspection of the differential keystream. For this, we clearly need to have access to the keystream at all times.
Assumption 1. The keystream is accessible, i.e., we are in the setting of a known plaintext attack.
In practice, this assumption can be slightly weakened as we do not need to have the whole keystream available. The simulations in Section 7, for instance, never used more than the first 40 keystream words. In order to be able to combine information of different fault injections, it is necessary that they convey information about the same sequence of internal states.

Assumption 2.
The internal state of the cipher can be reset to the initial state, e.g., by restarting the cipher with the same IV and key.
Let us point out here that it would also be sufficient to assume that the attacker is able to force the internal state to be equal to some S (t) for t ≥ 0 and the attack works nonetheless. In practice, however, it is usually easier to force it to S (0) by simply re-initializing the cipher, also known as nonce-misuse. If the attacker has hardware access to the device under inspection, this assumption surely poses no problem. For the injection itself we only have small restrictions, namely that the fault must happen in a short time period and only in one single byte representing exactly one of the registers.
Assumption 3. For a known $T \subseteq \mathbb{N}$ we are able to inject a random byte-fault $\varepsilon \in \mathbb{F}_2^8$ in a random register $R \in \mathcal{R}$ of the internal state at a random time $t \in T$ during the generation of the keystream. (This means that the value $R^{(t)}$ of register $R$ at time $t$ is replaced by $R^{(t)} + \varepsilon$.) To be more precise, we assume that an injection at time $t$ happens before the corresponding keystream output $z^{(t)}$ is computed, i.e., an injection at time $t$ can already affect the keystream output of the same time $t$. Notice that this assumption does not require us to know the injected fault $\varepsilon$, nor its precise temporal or spatial resolution. The only requirement we have is that the injections happen in a short time period $T$. For the random distributions of the faults and the affected registers we also impose no restrictions.
Let us point out right away that our experimental results were obtained using uniform distributions and the time period T = {1, . . . , 10}, i.e., the first ten clock cycles. We would expect similarly good results for other distributions and sets T.
In the following, we fix a finite $T \subseteq \mathbb{N}$ and assume that the above assumptions are satisfied for this set $T$. Recall that we also fixed an instance of the Enocoro stream cipher $\mathcal{S}$ with internal state $S^{(t)}$, keystream $z = (z^{(t)})_{t \ge 0}$, and set of registers $\mathcal{R}$.
Definition 1. Let $(R, t) \in \mathcal{R} \times T$, let $\tilde{z} = (\tilde{z}^{(t)})_{t \ge 0}$ be the keystream produced by the cipher if a fault $\varepsilon \in \mathbb{F}_2^8$ is injected into $R$ at time $t$, and let $z_{\mathrm{diff}} = z + \tilde{z}$.
(a) The tuple $(R, t)$ is called an injection point.
(b) The tuple $(R, t, z_{\mathrm{diff}})$ is called a fault injection, where $(R, t)$ is its injection point, $\tilde{z} = z_{\mathrm{diff}} + z$ its faulty keystream, and $z_{\mathrm{diff}}$ its differential keystream.
If the injected fault ε ∈ F 8 2 is relevant in the context, we also denote the fault injection (R, t, z diff ) by (R, t, ε, z diff ).
Under our assumptions an attacker is able to inject a fault $\varepsilon \in \mathbb{F}_2^8$ at the injection point $(R, t) \in \mathcal{R} \times T$ and observe the faulty keystream $\tilde{z}$, without knowing $\varepsilon$ or $(R, t)$.
From the original keystream one can then simply determine the differential keystream z diff . Hence, for every fault injection (R, t, ε, z diff ), we assume that an attacker is only in possession of z diff .
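As a small illustration of Definition 1, the following Python sketch forms the differential keystream from a correct and a faulty keystream and locates its first non-zero word. The function names and the sample byte values are illustrative assumptions; they are not taken from any Enocoro implementation.

```python
from typing import List, Optional


def differential_keystream(z: List[int], z_faulty: List[int]) -> List[int]:
    """Word-wise sum over F_2 (XOR) of the correct and the faulty keystream."""
    if len(z) != len(z_faulty):
        raise ValueError("keystreams must have equal length")
    return [a ^ b for a, b in zip(z, z_faulty)]


def first_nonzero_index(z_diff: List[int]) -> Optional[int]:
    """Smallest index s with z_diff[s] != 0, or None if the streams agree."""
    for s, word in enumerate(z_diff):
        if word != 0:
            return s
    return None


# Hypothetical keystream bytes, chosen only to exercise the functions.
z = [0x3A, 0x91, 0x07, 0xC4, 0x5E]
z_faulty = [0x3A, 0x91, 0x17, 0x00, 0x5F]

z_diff = differential_keystream(z, z_faulty)
s = first_nonzero_index(z_diff)
```

Since addition in $\mathbb{F}_2^8$ is its own inverse, the faulty keystream can be recovered from the pair $(z, z_{\mathrm{diff}})$ by the same operation.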
In the following section, we discuss the problem of finding the injection point of a fault injection and how one can use this information to gather information on the internal state of the cipher.

Differential Clocking Algorithm and Fault Localization
At the heart of every differential fault attack lies the propagation and computation of so-called differentials. These are the input and output differences of a non-linear function used in the cipher. To precisely track the effect of a fault injection on the internal state and the keystream output we introduce a Differential Clocking Algorithm. This algorithm returns a set of linear polynomials that relate distinct differentials with one another and with the keystream output. As a consequence, it tells us exactly whether, and how, certain differentials can be computed from the faulty keystream. Moreover, the output helps to decide whether a given differential keystream was produced by a fault injection at a certain injection point.
We use the following definition of a differential; notice that we only consider differentials in the non-linear function s 8 . This is sufficient here as no other non-linear functions are used in the specification of the Enocoro stream ciphers.
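Definition 2. Let $x, \delta, \Delta \in \mathbb{F}_2^8$ with $s_8(x) + s_8(x + \delta) = \Delta$. Then the pair $(\delta, \Delta)$ is called a differential in $x$, $\delta$ its input difference, and $\Delta$ its output difference, respectively.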
The goal of a fault injection is to find as many differentials as possible, since every differential in x gives us information on x, as the next remark points out.
Remark 1. Let $R \in \mathcal{R}$, $t \ge 0$, and $(\delta, \Delta)$ be a differential in the value $R^{(t)} \in \mathbb{F}_2^8$ of the register $R$ at time $t$. Then we clearly have
$$R^{(t)} \in S_{(\delta, \Delta)} := \{ x \in \mathbb{F}_2^8 \mid s_8(x) + s_8(x + \delta) = \Delta \}.$$
So every differential in $R^{(t)}$ corresponds to a set of bit-tuples that contains $R^{(t)}$. Given multiple differentials $(\delta_1, \Delta_1), \dots, (\delta_s, \Delta_s)$ in the same register value $R^{(t)}$, we also have
$$R^{(t)} \in \bigcap_{i=1}^{s} S_{(\delta_i, \Delta_i)}.$$
Table 1 shows the average size of this set for one and more differentials. One can see that in a vast majority of all cases, two differentials suffice to uniquely determine $R^{(t)}$. (Also notice that for $s = 1$ we have at least two solutions, as both $R^{(t)}$ and $R^{(t)} + \delta$ are in $S_{(\delta, \Delta)}$.)
Before we discuss the use of differentials in detail, we present the Differential Clocking Algorithm that forms the basis of all subsequent analyses.
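To make Remark 1 concrete, the following Python sketch enumerates the candidate set of a single differential and intersects the candidate sets of several differentials in the same value. Since the construction of $s_8$ is omitted here, the sketch uses a seeded random permutation of $\{0, \dots, 255\}$ as a stand-in S-box. This placeholder and all function names are assumptions made for illustration only, so the resulting set sizes say nothing about Table 1.

```python
import random
from typing import Iterable, List, Set, Tuple


def make_placeholder_sbox(seed: int = 0) -> List[int]:
    """A bijective 8-bit S-box stand-in; NOT the Enocoro S-box s_8."""
    rng = random.Random(seed)
    table = list(range(256))
    rng.shuffle(table)
    return table


def candidate_set(sbox: List[int], delta: int, Delta: int) -> Set[int]:
    """All x with sbox(x) + sbox(x + delta) = Delta, where + is XOR."""
    return {x for x in range(256) if sbox[x] ^ sbox[x ^ delta] == Delta}


def candidates_from_differentials(
    sbox: List[int], differentials: Iterable[Tuple[int, int]]
) -> Set[int]:
    """Intersect the candidate sets of several differentials in one value."""
    candidates = set(range(256))
    for delta, Delta in differentials:
        candidates &= candidate_set(sbox, delta, Delta)
    return candidates


SBOX = make_placeholder_sbox()
x = 0x5C                      # hypothetical unknown register value
deltas = [0x01, 0xA7]         # hypothetical non-zero input differences
diffs = [(d, SBOX[x] ^ SBOX[x ^ d]) for d in deltas]

one = candidate_set(SBOX, *diffs[0])
both = candidates_from_differentials(SBOX, diffs)
```

With a single differential the candidate set is closed under $x \mapsto x + \delta$, which is why it always contains at least two elements; each further differential can only shrink the intersection.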
Then from the specification (see Section 2) we know that $\mathrm{clock}_0, \dots, \mathrm{clock}_{n_b-1}$ are all linear. Moreover, $\mathrm{keystr}$ is a linear function.
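For $\tau \ge 0$ we now define the polynomial ring
$$P_\tau = \mathbb{F}_2\bigl[e_m,\ z_m^{(i)},\ d_{j,m}^{(i)},\ D_{j,m}^{(i)} \;\big|\; 0 \le i \le \tau,\ 1 \le j \le 4,\ 1 \le m \le 8\bigr].$$
To keep the presentation simple, we abbreviate $(e_1, \dots, e_8)$ by $e$, $(z_1^{(i)}, \dots, z_8^{(i)})$ by $z^{(i)}$, $(d_{j,1}^{(i)}, \dots, d_{j,8}^{(i)})$ by $d_j^{(i)}$, and $(D_{j,1}^{(i)}, \dots, D_{j,8}^{(i)})$ by $D_j^{(i)}$.

Algorithm 1: Differential Clocking Algorithm.
 4 Append the eight linear polynomials $\mathrm{keystr}(\tilde{S}^{(i)}) - z^{(i)} \in (P_\tau)^8$ to $L$.
 5 for $j = 1$ to $4$ do
 6 Append the eight linear polynomials $\tilde{b}$
 end
12 return $L$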
Conceptually, this algorithm traces the propagation of a generic fault $e = (e_1, \dots, e_8)$, injected at the injection point $(R, t)$, through the internal state and its differential keystream output, represented by the indeterminates $z^{(i)}$. The algorithm keeps track of the structure of the differential internal state in terms of the fault and its propagations. With every clocking, the (differential) state $\tilde{S}$ gets updated as follows: if a register value is updated linearly, we can use the linear update function for the differential state as well (line 9). If the update function is non-linear, we introduce new indeterminates $D_i^{(t)}$ representing the output difference in that specific part. This is done in line 10 where we replace each output of the function $s_8$ with inputs $b_{k_j}$ by a new indeterminate. At the same time, we also keep track of the inputs to these non-linear functions using the indeterminates $d_j^{(i)}$. In that context, line 7 is correct since $s_8$ is bijective: if the input difference is zero, then the output difference has to be zero, too. Finally, line 4 ensures that the differential keystream is related correctly to the differential internal state.
From this, we immediately see that, among others, the linear polynomials in $L$ relate the differential keystream indeterminates $z^{(i)}$ with the differentials in the values of the registers $b_{k_1}$, $b_{k_2}$, $b_{k_3}$, and $b_{k_4}$ at all the time steps $i \in \{t, \dots, \tau\}$.
For better formalization, we denote the ideal generated by the output of Algorithm 1 by $L_\tau(R, t)$, where $(R, t) \in \mathcal{R} \times T$ is the injection point and $\tau \ge 0$. Moreover, the elimination ideal
$$Z_\tau(R, t) := L_\tau(R, t) \cap \mathbb{F}_2\bigl[z_m^{(i)} \;\big|\; 0 \le i \le \tau,\ 1 \le m \le 8\bigr]$$
in the keystream indeterminates will play a major role later on. Notice that a generating set of this ideal can be found via an appropriate Gaussian elimination of the linear polynomials in $L_\tau(R, t)$. Moreover, for a set of polynomials $S \subseteq \mathbb{F}_2[x_1, \dots, x_n]$ we denote its set of zeros in $\mathbb{F}_2^n$ by
$$Z_{\mathbb{F}_2}(S) := \{ (a_1, \dots, a_n) \in \mathbb{F}_2^n \mid f(a_1, \dots, a_n) = 0 \text{ for all } f \in S \}.$$
Now the next result is an immediate consequence of the above discussion.
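Proposition 1. Let $(R, t, \varepsilon, z_{\mathrm{diff}})$ be a fault injection, let $\tau \ge 0$, and let $\delta_j^{(i)}$ and $\Delta_j^{(i)}$ be the input and output differences of $s_8$ in the values of the registers $b_{k_j}$, where $1 \le j \le 4$, for $0 \le i \le \tau$ which appear during the generation of the faulty keystream of the fault injection. Then we have
$$\bigl(\varepsilon,\ z_{\mathrm{diff}}^{(i)},\ \delta_j^{(i)},\ \Delta_j^{(i)}\bigr)_{0 \le i \le \tau,\ 1 \le j \le 4} \in Z_{\mathbb{F}_2}\bigl(L_\tau(R, t)\bigr)$$
and thus, in particular, $\bigl(z_{\mathrm{diff}}^{(0)}, \dots, z_{\mathrm{diff}}^{(\tau)}\bigr) \in Z_{\mathbb{F}_2}\bigl(Z_\tau(R, t)\bigr)$.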

Remark 2.
In view of the first part of this proposition, we see that one might be able to determine certain input and output differences from the differential keystream $z_{\mathrm{diff}}$, if the injection point $(R, t)$ is known. For this, we compute a generating set of $L_\tau(R, t)$ and consider the substitution $z^{(i)} \mapsto z_{\mathrm{diff}}^{(i)}$ for $0 \le i \le \tau$. If the resulting system of linear equations has a unique solution for some of the indeterminates $d_j^{(i)}$ or $D_j^{(i)}$, then this solution is the corresponding input or output difference, respectively. Ideally, this happens for matching pairs of differences such that one can obtain a differential in some register value $b_{k_j}^{(i)}$. For Enocoro-128v2 it turns out that, for each injection point, between four and six differentials in the register $b_2$ can be found. All of this is, however, only possible if the precise injection point is known. Thus we now focus on how to determine this injection point given just a corresponding faulty keystream.
A straightforward approach for this uses the previous proposition: Given the differential keystream $z_{\mathrm{diff}}$, one can check for all injection points $(R, t) \in \mathcal{R} \times T$ whether $\bigl(z_{\mathrm{diff}}^{(0)}, \dots, z_{\mathrm{diff}}^{(\tau)}\bigr) \in Z_{\mathbb{F}_2}\bigl(Z_\tau(R, t)\bigr)$. Clearly, a limitation of this method is that if the sets $Z_\tau(R, t)$ and $Z_\tau(R', t')$ are equal for two injection points $(R, t)$ and $(R', t')$, the injection points cannot be distinguished. This hints at the more fundamental issue that, in general, there is no unique answer to our question, i.e., there are differential keystreams that can be produced from fault injections into distinct injection points.
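The elimination step and the membership test described above can be sketched as follows. This is a minimal Python sketch under stated assumptions: homogeneous linear polynomials over $\mathbb{F}_2$ are encoded as integer bitmasks (bit $i$ is the coefficient of the $i$-th indeterminate), the keystream indeterminates occupy the lowest bits, and the three-polynomial toy system is hypothetical and not an output of Algorithm 1.

```python
from typing import Dict, List


def echelon_basis(polys: List[int]) -> List[int]:
    """Gaussian elimination over F_2; leading term = highest set bit."""
    basis: Dict[int, int] = {}          # leading bit -> reduced polynomial
    for p in polys:
        while p:
            lead = p.bit_length() - 1
            if lead not in basis:
                basis[lead] = p
                break
            p ^= basis[lead]            # cancel the leading term
    return [basis[lead] for lead in sorted(basis, reverse=True)]


def eliminate(polys: List[int], num_keystream_vars: int) -> List[int]:
    """Basis elements that involve only the lowest num_keystream_vars bits."""
    bound = 1 << num_keystream_vars
    return [p for p in echelon_basis(polys) if p < bound]


def is_common_zero(generators: List[int], point: int) -> bool:
    """True if every linear generator vanishes at the given 0/1 point."""
    return all(bin(g & point).count("1") % 2 == 0 for g in generators)


# Toy indeterminates: z0, z1, z2 (keystream), then d and e (to be eliminated).
Z0, Z1, Z2, D, E = (1 << i for i in range(5))
toy_system = [E ^ D, D ^ Z0 ^ Z1, E ^ Z2]

generators = eliminate(toy_system, num_keystream_vars=3)
```

Because the basis elements have pairwise distinct leading terms and the indeterminates to be eliminated are the largest ones, exactly those basis elements without such indeterminates span the linear part of the elimination ideal. In the toy system this leaves the single relation $z_0 + z_1 + z_2$.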
If two fault injections are equivalent according to this definition, then all the input and output differences are based in the same way on the previous differences and, in particular, the fault ε. Thus for any two fault injection sequences with the same fault ε in those injection points, the differential keystreams must be equal. This shows that points in the same equivalence class cannot be distinguished using just the differential keystream. However, since our fault analysis is based on these sets, L τ (R, t), it is also sufficient to just determine the equivalence class.
Under certain conditions of the stream cipher's parameters, the number of time steps it takes for a fault to affect the keystream is determined uniquely by the injection point, independent of the values of the internal state and the injected fault. This allows us to attribute the injection point corresponding to a given differential keystream to one of just nine equivalence classes. In particular, this result is applicable for our attack on Enocoro-128v2 (and also for Enocoro-80).

Proposition 2.
Assume that the parameters of the Enocoro stream cipher S satisfy: Let ψ : R × T → N be the map assigning each injection point (R, t) to the index s ∈ N, as above.
(b) For every $s \in \mathbb{N}$, the set $\psi^{-1}(s)$ contains at most nine equivalence classes.
Proof. For part (a) it suffices to show that it takes a fixed amount of time steps less than to reach the keystream. If R ∈ {a 0 , a 1 }, then a non-zero keystream difference can be found in z (t) or z (t+1) , respectively. Now we can assume that R is part of the buffer, which we decompose as follows: By definition of clock, the fault can only spread to the differential keystream via the registers b k 1 , b k 2 , b k 3 , b k 4 , and a 0 , a 1 . Thus we only need to show that a fault in R (t) spreads to one of these registers in a small enough fixed number of time steps. By assumption on the parameters of the Enocoro cipher S, a fault in one of the above sets stays in the registers of that same set until it reaches exactly one of the registers in {b k 1 , b k 2 , b k 3 , b k 4 }. Moreover, the precise number of steps until this happens is determined by (k 1 − t) mod n b , k 2 − t, k 3 − t, and k 4 − t respectively. If the fault is in b k 1 , b k 2 , or b k 4 , then after just one additional clocking it is present in a 1 and thus affects the keystream of the same time stamp. In case the fault is in b k 3 , two additional time steps are required. One for the fault to get to a 0 and another one to spread to a 1 . As k 4 − k 3 > 1 we also know that the fault in a 1 must be non-zero, since no other input for the state update of a 1 can already be corrupted. The claim is an immediate consequence.
In order to prove (b), notice that the above shows that the set ψ −1 (s) is a subset of : A straightforward analysis shows that the injection points in each of those nine sets are indeed equivalent.
For this proposition the actual equations in L τ (R, t) are irrelevant, the result is derived only from the structure of the Enocoro stream ciphers. In combination with Proposition 1, this gives us a fairly good tool to find a small set of equivalence classes containing the class of the actual injection point.

Proposition 3 (Locating Injection Points).
Assume that the parameters of the stream cipher $\mathcal{S}$ under consideration satisfy the hypotheses of Proposition 2. Let $(R, t, \varepsilon, z_{\mathrm{diff}})$ be a fault injection and let $s \ge 0$ be minimal with $z_{\mathrm{diff}}^{(s)} \neq 0$. Then $(R, t)$ belongs to one of the, at most nine, equivalence classes contained in
$$\bigl\{ (R', t') \in \psi^{-1}(s) \;\big|\; \bigl(z_{\mathrm{diff}}^{(0)}, \dots, z_{\mathrm{diff}}^{(\tau)}\bigr) \in Z_{\mathbb{F}_2}\bigl(Z_\tau(R', t')\bigr) \bigr\}.$$
This proposition admits an efficient implementation: the equivalence classes in $\psi^{-1}(s) \subseteq \mathcal{R} \times T$ can be represented by fewer than nine elements, and in view of the proof of Proposition 2 we can immediately derive those from $s$. Now, it suffices to check for each one of these injection points $(R, t)$ whether the first $\tau + 1$ differential keystream outputs are a zero of the (linear) generators of $Z_\tau(R, t)$, as this ideal is equal for equivalent injection points. The parameter $\tau$ should only be chosen large enough that the ideals $Z_\tau(R, t)$ allow us to distinguish distinct equivalence classes as well as possible.
Remark 3. One might think that the larger $\tau$ is, the more linear polynomials one can find in $Z_\tau(R, t)$. Although a larger $\tau$ surely increases the size of $L_\tau(R, t)$, the number of indeterminates that appear in this set increases as well. Close inspection of Algorithm 1 shows that the output contains between $32t + 40\tau + 64$ and $72\tau + 72$ linear equations in $72\tau + 80$ indeterminates. Thus, the system never has a unique solution. Furthermore, the more coordinates of the differential state $\tilde{S}^{(t)}$ are non-zero (in particular $\tilde{b}_{k_1}$, $\tilde{b}_{k_2}$, $\tilde{b}_{k_3}$, and $\tilde{b}_{k_4}$), the fewer polynomials are added to $L$ in line 6. Now, it suffices to notice that with each additional loop iteration, more entries of $\tilde{S}^{(t)}$ are assigned non-zero entries. This suggests that for some large enough $\tau \ge 0$ the ideal $Z_\tau(R, t)$ could become stationary.
In the remainder of this section, we show the implications of the previous propositions for Enocoro-128v2. As indicated by the last remark, in practice, linear equations in the keystream indeterminates can only be derived from the first few non-zero differential keystream outputs.

Example 2.
For Enocoro-128v2 we have that $Z_{t+8}(b_0, t) = \dots = Z_{t+80}(b_0, t)$ and this ideal is generated by $24 + 8t$ linear polynomials. For other injection points, we also come to the conclusion that $Z_\tau(R, t)$ seems to become stationary already for $\tau = t + 8$.
The next remark collects some special properties we could observe when applying Proposition 3.
Remark 4. (b) For $s + 3 \le \tau \le s + 80$, one can check that the sets $Z_\tau(R, t)$ for these representatives form only a total of four distinct ideals.
(c) Our experiments indicate that we are always in that best case. This means that an injection point is attributed to exactly the correct set of equivalence classes. In other words, there seems to be no differential keystream $z_{\mathrm{diff}}$ that is a zero of more than one of the above four ideals. So, in practice, we can determine a set comprising fewer than three injection points where one of them is equivalent to the real injection point. This result is also summarized graphically in Figure 2. The injection points with the same base colour share the same ideal $Z_\tau(R, t)$, and those that also have the same colour shading are indeed equivalent.
(d) Finally, we remark that in each of those four groups of equivalent injection points, the timings of the injection points are distinct. So if the time $t$ of the injection point $(R, t)$ is known, we can directly infer the register $R$ and thereby uniquely identify the injection point.
To put it differently: if an actual physical implementation of our fault model allows us to measure the precise time a fault was injected in an instance of Enocoro-128v2, then, in practice, the corresponding injection point can be found with little effort. Using this additional information, our fault attack becomes considerably more efficient, while requiring the same small number of fault injections.
To finish this section, recall Remark 2, where we explained how to compute input and output differences from the set $L_\tau(R, t)$ and the differential keystream $z_{\mathrm{diff}}$, if we are given a fault injection $(R, t, z_{\mathrm{diff}})$. Table 2 gives an overview of the number of input and output differences that can be computed, as well as the register values in which those can be combined to differentials. Since these numbers only depend on the ideal $L_\tau(R, t)$, it suffices to list one injection point for each equivalence class.

Table 2. Number of input and output differences that can be computed from $L_{t+60}(R, t)$ given an injection point $(R, t)$, in the case of Enocoro-128v2, as described in Remark 2.
Inj. pt. | # In-/Output Diffs | Differentials in

Remark 5.
For an actual physical implementation of a fault injector satisfying our assumptions, the injection points that can be observed adhere to some probability distribution on R × T. If this distribution is known (or can be estimated), one can assign, to every equivalence class, the probability that the injection point lies in it based on the elements that the equivalence class has in common with R × T. This is the basis of a minor optimization in the fault attack where we perform an exhaustive search over the different injection points that come into question after using Proposition 3.
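The ordering described in Remark 5 can be illustrated with a minimal Python sketch. The injection-point labels, the class names `E1` and `E2`, and the probabilities are hypothetical toy values, not data from the paper; the function simply sums the probability mass of each equivalence class and sorts the classes so that the most likely one is tried first.

```python
from collections import defaultdict

def rank_equivalence_classes(distribution, class_of):
    """Sum the probability mass per equivalence class, most likely class first."""
    mass = defaultdict(float)
    for point, prob in distribution.items():
        mass[class_of[point]] += prob
    return sorted(mass.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical toy data: injection points (register, time) and their classes.
distribution = {("a0", 1): 0.5, ("a1", 1): 0.25, ("b2", 2): 0.25}
class_of = {("a0", 1): "E1", ("a1", 1): "E2", ("b2", 2): "E1"}

ranking = rank_equivalence_classes(distribution, class_of)
```

Processing candidate injection points in this order does not change which candidates are examined, only how early the correct one is likely to be reached.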

Combining Fault Injections
In this section we show how the information gained from several fault injections can be combined in a structured and automated way. The core idea for this was first developed in [19]. There, a connection was drawn between Horn clauses that represent relations between the values of registers at different time steps, guess-and-determine (GD) attacks, and fault attacks. In this work, we extend this approach to consider not only exact relations among the register values, but also containments of register values in small sets that are derived from other register values.
For completeness, we first introduce the original approach, before we explain our adjustments.

Construction 1. Let τ ≥ 0 be the number of time steps to consider. Introduce propositional logic variables $A_R^{(t)}$ for $R \in \mathcal{R}$ and $t \in \{0, \dots, \tau\}$, where $A_R^{(t)}$ corresponds to the truth value of the statement 'The value of register R at time t is known.' Now relations among the values of the registers at consecutive times can be derived from the next-state function clock. These relations are of the form:

$$R^{(t)} = f\bigl(R_1^{(t_1)}, \dots, R_k^{(t_k)}\bigr)$$

for some function $f$, with $R, R_1, \dots, R_k \in \mathcal{R}$ and $t, t_1, \dots, t_k \in \{0, \dots, \tau\}$. This means that if the value of $R_i$ is known at time $t_i$ for all $i = 1, \dots, k$, then the value of register $R$ at time $t$ can be determined as well. In the logic setting this corresponds to the implication:

$$A_{R_1}^{(t_1)} \wedge \dots \wedge A_{R_k}^{(t_k)} \Rightarrow A_R^{(t)}$$

In particular, such implications are Horn formulas, and we also denote them by the equivalent formula $\neg A_{R_1}^{(t_1)} \vee \dots \vee \neg A_{R_k}^{(t_k)} \vee A_R^{(t)}$. We collect all Horn clauses that one can derive in this way from the first τ ≥ 0 time steps using the definition of clock in the set $C_\tau$.
The following remark gives a few more details on how this set C τ can be constructed in the case of the Enocoro stream cipher.

Remark 6.
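For any Enocoro stream cipher S, every equation that is used to define the next-state function clock admits as many clauses as there are distinct register values occurring in it. As an example, consider the equation

$$b_0^{(t+1)} = b_{n_b-1}^{(t)} + a_0^{(t)},$$

which holds for all t ≥ 0. By rewriting it to $b_{n_b-1}^{(t)} = b_0^{(t+1)} + a_0^{(t)}$ and $a_0^{(t)} = b_0^{(t+1)} + b_{n_b-1}^{(t)}$ we get two more equations. Altogether they correspond to the three Horn formulas:

$$\neg A_{b_{n_b-1}}^{(t)} \vee \neg A_{a_0}^{(t)} \vee A_{b_0}^{(t+1)}, \qquad \neg A_{b_0}^{(t+1)} \vee \neg A_{a_0}^{(t)} \vee A_{b_{n_b-1}}^{(t)}, \qquad \neg A_{b_0}^{(t+1)} \vee \neg A_{b_{n_b-1}}^{(t)} \vee A_{a_0}^{(t)}.$$

By careful counting, we see that this construction admits exactly $(n_b + 16) \cdot (\tau - 1)$ clauses.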
In the following we denote the set of logical variables appearing in a set of clauses C by Vars(C) = {A_1, ..., A_n}, and for F ∈ C and $(a_1, \dots, a_n) \in \mathbb{F}_2^n$ we define $F(a_1, \dots, a_n)$ as the truth value of F under the assignment $A_i \mapsto a_i$, where 1 = true and 0 = false, as usual. A satisfying assignment of C is a tuple $(a_1, \dots, a_n) \in \mathbb{F}_2^n$ with $F(a_1, \dots, a_n) = 1$ for all F ∈ C. Moreover, we denote the well-known Marking Algorithm (see [31]) of Horn logic by Mark. Recall that it takes a set of Horn formulas as input and returns the unique satisfying assignment of minimal Hamming weight, if it exists.
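To make the role of Mark concrete, here is a minimal Python sketch of the Marking Algorithm. The clause encoding as `(body, head)` pairs and the toy variables `x`, `y`, `z` are assumptions made for illustration; this simple version rescans all clauses until nothing changes and is therefore quadratic rather than linear-time.

```python
def mark(clauses):
    """Marking algorithm for Horn clauses.

    Each clause is a pair (body, head): body holds the variables that occur
    negated, head is the single positive variable, or None for a clause
    without a positive literal. Returns the set of variables that are true
    in the minimal model, or None if the clause set is unsatisfiable.
    """
    marked = set()
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if set(body) <= marked:
                if head is None:
                    return None  # all negated variables are forced true
                if head not in marked:
                    marked.add(head)
                    changed = True
    return marked

# Toy relation x = y + z, rewritten for each of the three variables,
# together with the facts that y and z are known.
clauses = [
    (("y", "z"), "x"),
    (("x", "z"), "y"),
    (("x", "y"), "z"),
    ((), "y"),
    ((), "z"),
]
model = mark(clauses)
```

In the notation of the text, the result (1, ..., 1) corresponds to `model` containing every variable.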
In [19, Corollary 6.9] a deterministic algorithm based on Gröbner basis computations was proposed to find minimal sets $G = \{\{A_{i_1}\}, \dots, \{A_{i_k}\}\}$ such that Mark(C ∪ G) = (1, ..., 1). Such a set G is called a guess basis for C. If $C = C_\tau$ as in the above construction, then the following two statements are equivalent: (a) $G = \{\{A_{R_1}^{(t_1)}\}, \dots, \{A_{R_k}^{(t_k)}\}\}$ is a guess basis for $C_\tau$, and (b) the internal state can be computed from $R_1^{(t_1)}, \dots, R_k^{(t_k)}$. This already indicates that every guess basis G for $C_\tau$ gives rise to a guess-and-determine (GD) attack with an attack complexity of at most $2^{8 \cdot \#G}$. Using mixed-integer linear programming solvers it was shown in [11] that for Enocoro-128v2 the smallest guess basis for $C_{16}$, as in Construction 1, consists of at most 18 logical variables. Thus, the best GD attack requires a guess of no more than 18 register values, which is just slightly more complex than guessing the 16 bytes of the key and IV. This result tells us that the fault injections should provide us with information on the internal state comparable to at least 18 register values in order to be able to retrieve the internal state.
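The notion of a guess basis can be illustrated on a toy clause set. The following Python sketch uses plain brute force over subsets, not the Gröbner basis method of [19] or the MILP approach of [11]; the variables `a`, `b`, `c`, `d` and the relations between them are hypothetical.

```python
from itertools import combinations

def closure(clauses, known):
    """Forward-chain definite Horn clauses (body, head) from a set of known variables."""
    known = set(known)
    changed = True
    while changed:
        changed = False
        for body, head in clauses:
            if head not in known and set(body) <= known:
                known.add(head)
                changed = True
    return known

def minimal_guess_basis(clauses, variables):
    """Return a smallest set of variables whose closure marks every variable."""
    for size in range(len(variables) + 1):
        for guess in combinations(variables, size):
            if closure(clauses, guess) == set(variables):
                return set(guess)
    return None

# Toy system: c = a + b (three clauses) and d = c (two clauses).
variables = ["a", "b", "c", "d"]
clauses = [
    (("a", "b"), "c"), (("a", "c"), "b"), (("b", "c"), "a"),
    (("c",), "d"), (("d",), "c"),
]
basis = minimal_guess_basis(clauses, variables)
gd_complexity_log2 = 8 * len(basis)  # byte-sized registers: at most 2^(8 * #G)
```

The exhaustive subset search is exponential in the number of variables, which is why it is only suitable for toy instances like this one.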
The following remark explains why considering several fault injections independently is unlikely to result in a fault attack with complexity below $2^{32}$.

Remark 7.
Using Table 2 we see that every fault injection admits the knowledge of up to five differentials for the register $b_2$. Assume that we repeatedly inject faults where the fault injection point can be determined uniquely, and that ultimately we are able to derive so many differentials that the value of $b_2$ can be determined for m consecutive time steps. Note that this requires at least $2 \cdot \frac{m}{5}$ fault injections, as a single differential can never uniquely determine a register's value. To check whether we can compute the internal state from these register values we can switch to the logical setting: let τ ≥ 0 and $C_\tau$ be as in Construction 1. Moreover, let $F_m$ be the set of fact clauses $\{A_{b_2}^{(t)}\}$ for these m consecutive time steps t. A simple computation shows that $\mathrm{Mark}(C_{500} \cup F_m) \neq (1, \dots, 1)$ for all 1 ≤ m ≤ 500. This means that even more than 200 fault injections are not sufficient to determine the internal state directly. Nonetheless, guessing a few additional register values might make this feasible. As long as we only need to guess a small number of values this is clearly acceptable. This corresponds exactly to finding an optimal guess basis for $C_\tau \cup F_m$. Using the Gröbner basis method of [19], we computed those for $C_{35} \cup F_m$ where 1 ≤ m ≤ 35. All of these guess bases contained at least four elements, i.e., the internal state can only be determined if, in addition to the fault injections, at least four bytes are guessed.
This shows that a search space of size $2^{32}$ is still left, even after considering more than 14 fault injections.
The remark indicates that it is not ideal to look only at the differentials that can be deduced from each fault injection in isolation from the others. Therefore we need a more sophisticated approach that better captures the interplay of the various fault injections.
In the following, we take a different perspective on the concept of a guess basis. Instead of looking at the register values that need to be known to be able to determine the internal state, we focus on how this is done, i.e., the order in which the internal state can actually be computed and which formulas (and functions) are used in every step. Moreover, instead of functions that represent exact relations among the register values, we allow functions that return a small set in which another register's content is contained. To cope with the fact that these sets may, on average, have different sizes, we attach to the corresponding Horn clauses a weight representing the average size of this set.
Using this refined concept, we capture the information we gain on the internal state using multiple fault injections better. This includes modelling all the input and output differences that occur during the propagation of a fault injection by logical variables such that we do not only utilize those which are deduced from the differential keystream right away.
In the next definition, we denote the set of negative literals of a clause F by Neg(F), and the set of positive literals by Pos(F). Informally speaking, a guessing path G represents the order in which the Marking Algorithm applied to {F i 1 , . . . , F i s } can choose the clauses for the markings. Thus it is also clear that Mark(C) = (1, . . . , 1), if and only if there exists a guessing path G in C. In fact, the notion of a guessing path can be seen as a generalization of the concept of a guess basis.

Remark 8.
Let C be a set of Horn clauses, and let $C' = C \cup \{\{A\} \mid A \in \mathrm{Vars}(C)\}$ be a set of weighted Horn clauses where all $F \in C$ have weight 0, and all $F \in C' \setminus C$ have weight 1. Then we have:
(a) If $\mathcal{G}$ is a guessing path in $C'$, then the set $G$ of clauses of weight 1 appearing in $\mathcal{G}$ is a guess basis for C. In particular we have $\#G = \omega(\mathcal{G})$.
(b) If $G = \{\{A_1\}, \dots, \{A_n\}\}$ is a guess basis for C, let $\mathcal{G}$ be the order in which the clauses of $G \cup C \subseteq C'$ are processed when computing Mark(G ∪ C). Then $\mathcal{G}$ is a guessing path in $C'$ with $\omega(\mathcal{G}) = \#G$.
(c) If G is an optimal guess basis for C, then we have for any optimal guessing path $\mathcal{G}$ in $C'$ that $\omega(\mathcal{G}) = \#G$.
With this improved notion we are able to adequately treat relations arising from i ∈ S (δ,∆) . More precisely, we construct a set of weighted Horn clauses as follows.

Construction 2.
Let τ ≥ 0 be the number of time steps to be considered, and consider a set $\{(R_1, t_1, z_{\mathrm{diff},1}), \dots, (R_m, t_m, z_{\mathrm{diff},m})\}$ of m ≥ 0 fault injections. For every 1 ≤ s ≤ m denote the differentials that appear during the fault propagation of the injection $(R_s, t_s, z_{\mathrm{diff},s})$ in the registers $b_{k_j}^{(i)}$ for every 0 ≤ i ≤ τ and 1 ≤ j ≤ 4 by $(\delta_{j,s}^{(i)}, \Delta_{j,s}^{(i)})$. Furthermore, denote the set of all input and output differences of this fault injection by $D_s$.
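Introduce a set of propositional logic variables consisting of the variables $A_R^{(t)}$ of Construction 1 together with a variable $A_d$ for every difference $d \in D_s$ with 1 ≤ s ≤ m, with the following correspondence: $A_{\Delta_{j,s}^{(t)}}$ ←→ 'The output difference in $b_{k_j}$ at time t of the fault injection $(R_s, t_s, z_{\mathrm{diff},s})$ is known.' Similar to Construction 1, the next-state function clock and the fault injections admit relations of the form:

$$R^{(t)} \in \varphi\bigl(R_1^{(t_1)}, \dots, R_k^{(t_k)}\bigr)$$

This means that if the value of $R_i$ is known at time $t_i$ for all $i = 1, \dots, k$, one can determine the value of register R at time t as an element of a set. We translate this to our logical setting as the Horn clause

$$\bigl(A_{R_1}^{(t_1)} \wedge \dots \wedge A_{R_k}^{(t_k)}\bigr) \Rightarrow A_R^{(t)}$$

with weight

$$w = \log_2\Bigl(2^{-8k} \sum_{(r_1, \dots, r_k) \in (\mathbb{F}_2^8)^k} \#\varphi(r_1, \dots, r_k)\Bigr)$$

equal to the logarithm of the average size of the set the relation is based on.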
There are various ways to construct such a set of weighted Horn clauses and their associated functions for Enocoro stream ciphers. Clearly, one can adapt the functions that appear in Construction 1 such that they return a set containing exactly one element, and assign their respective clause the weight 0.
In view of Remark 7, it can also be useful to allow the guessing of entire register values. To model this, see Remark 8, we consider the fact clauses {A (i) R } for each R ∈ R and 0 ≤ i ≤ τ of weight 8, whose corresponding functions take no input and return the set F 8 2 .
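The weight formula of Construction 2 can be checked numerically on toy functions. The following Python sketch evaluates the base-2 logarithm of the average output-set size by enumerating all inputs; the three example functions are hypothetical stand-ins (an exact relation, a full-byte guess, and a relation that leaves four bits open), not functions taken from Enocoro.

```python
import math
from itertools import product

def clause_weight(phi, k, bits=8):
    """log2 of the average size of phi(r_1, ..., r_k) over all k-tuples of registers."""
    total = 0
    for inputs in product(range(2 ** bits), repeat=k):
        total += len(phi(*inputs))
    average = total / (2 ** (bits * k))
    return math.log2(average)

def exact_relation(r1, r2):
    """Exact relation: the target register is determined uniquely."""
    return {r1 ^ r2}

def full_guess():
    """Guessing an entire register: no inputs, all 256 values are possible."""
    return set(range(256))

def low_nibble_open(r):
    """Hypothetical relation that fixes the high nibble and leaves 16 candidates."""
    return {(r & 0xF0) | x for x in range(16)}

w_exact = clause_weight(exact_relation, 2)
w_guess = clause_weight(full_guess, 0)
w_nibble = clause_weight(low_nibble_open, 1)
```

The first two values reproduce the weights 0 and 8 that the text assigns to exact relations and to guesses of entire register values.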
The clauses so far use no fault information. Now let us describe precisely how to construct clauses encoding the information gain by fault injections and their mutual interplay.

Remark 9.
Consider the setting as in Construction 2. The desired weighted Horn clauses and their respective functions can also be found in the following ways: (a) In Remark 2, we explained how certain input and output differences in D s can be computed from the differential keystream z diff,s using the set L τ (R s , t s ) ⊆ P τ . For each such difference d ∈ D s , we can consider the corresponding logical variable A d as a fact clause of weight 0.
The corresponding function takes no input and returns the set containing only the value of the respective difference.
(b) From every differential (δ j,s ) we get the following three relations: which give rise to three weighted Horn formulas: where the weight of the first formula can be deduced from Table 1.
Additionally, for every subset F ⊆ {1, . . . , m} we also have b . The corresponding weights can again be deduced from Table 1 if #F < 4. (c) The set L τ (R s , t s ) also contains linear equations that relate the bits of the input and output differences in D s to one another. Each subset of those equations that relates all bits of one difference with the bits of other differences can then be translated to a Horn clause of weight 0.
The set of all such relations can be computed using linear algebra techniques. However, we do not want to consider all of these exponentially many clauses, since we need to process them quickly later on. Thus, for our experiments, we used the following strategy to find a set of such (linear) relations: (1) Replace the keystream variables $z^{(i)}$ by the differential keystream z

From now on, let $C_{\tau,((R_1,t_1,z_{\mathrm{diff},1}),\dots,(R_m,t_m,z_{\mathrm{diff},m}))}$ be a set of weighted Horn clauses as described above. Notice that the last part of the remark ensures that there has to exist at least one guessing path. Given any such guessing path, Algorithm 2 can be used to determine the internal state of the stream cipher. (Notice the similarity to the GD attack framework in [19].)

Proposition 4. Algorithm 2 is an algorithm that computes the initial state $S^{(0)}$ of the stream cipher.
3 for i = 1 to n do
4 Compute $S_i = \varphi_i(r_{j_{i,1}}, \dots, r_{j_{i,k_i}})$ where $\varphi_i$ denotes the function associated to $F_i$.
5 if $S_i \neq \emptyset$ then
6 Choose $r_i \in S_i$ and remove it from $S_i$.
7 Let S be the output produced when proceeding with the next iteration.

Thus, the algorithm terminates correctly. Since $S_i \subseteq \mathbb{F}_2^8$ by construction, lines 6 and 7 are performed at most $2^8$ times in a single iteration. By induction on n, we also see that every step of the algorithm is performed at most $2^{8n}$ times. This shows that the algorithm terminates in finitely many steps. Now we can also see why defining the weights as the base-2 logarithms of the expected sizes of the sets $S_i$ is indeed useful.

Remark 10.
Let C be a set of weighted Horn clauses as in Construction 2, and let G be a guessing path in C. Assume that the values corresponding to the variables in Vars(C) behave like independent random variables of a uniform distribution. Then the expected size of the search space considered by Algorithm 2, used with the guessing path G, is bounded from above by 2 ω(G) .
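The search along a guessing path and the bound of Remark 10 can be illustrated with a small recursive Python sketch. It follows the recovered steps of Algorithm 2 only loosely; the three toy registers, the candidate sets, and the `keystream` consistency check are hypothetical and chosen so that the path has weight 1 + 8 + 0 = 9.

```python
def search(path, check, assignment=None, i=0):
    """Walk a guessing path; each step yields a candidate set for one register."""
    if assignment is None:
        assignment = {}
    if i == len(path):
        # All registers assigned: accept only if the final check succeeds.
        return dict(assignment) if check(assignment) else None
    target, inputs, phi = path[i]
    candidates = phi(*(assignment[v] for v in inputs))
    for value in sorted(candidates):  # an empty set means immediate backtracking
        assignment[target] = value
        result = search(path, check, assignment, i + 1)
        if result is not None:
            return result
    assignment.pop(target, None)
    return None

# Hypothetical toy instance with byte-sized registers x, y, z.
secret = {"x": 0x3A, "y": 0xC5, "z": 0x3A ^ 0xC5}

def keystream(a):
    """Toy stand-in for comparing a candidate state against observed keystream."""
    return ((7 * a["x"] + a["y"]) % 256, a["z"])

observed = keystream(secret)

path = [
    ("x", (), lambda: {0x1B, 0x3A}),         # fault information: two candidates, weight 1
    ("y", (), lambda: set(range(256))),      # full guess, weight 8
    ("z", ("x", "y"), lambda x, y: {x ^ y}), # exact relation, weight 0
]
recovered = search(path, lambda a: keystream(a) == observed)
```

At most 2 · 256 · 1 = 2^9 complete assignments are examined here, which matches the bound 2^ω(G) for this path.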
Next, we compare the Guessing Path Attack to an approach using SAT solvers for combining partial information.
Remark 11. Algorithm 2 proceeds in a way which bears some similarity to the approach of a DPLL-based SAT-solver. There we deal with Boolean variables instead of registers values here, the role of unit propagation there is taken by the guessing path here, and conflicts arise as empty sets S i (in line 5) and incorrect keystreams (in line 14).
Therefore, it appears possible to mimic the Guessing Path Attack with a modern SAT-solver by using the given guessing path as a variable guidance strategy and expressing the functions corresponding to weighted Horn clauses as CNF-formulas. (Here each register would have to be represented by eight Boolean variables.) While such a SAT-based approach could possibly deliver good timings, since it can learn from incorrect intermediate assignments, it also requires a substantial amount of work to construct an appropriate CNF encoding. It is unclear whether one can appropriately adapt the variable selection routine of the chosen SAT solver. In contrast to this, our Algorithm 2 admits a straightforward recursive implementation and proves to be more than sufficient for the requirements of our attack (see Section 7).
As a next step towards our fault attack, we need an efficient way to find guessing paths, since our attack must do this quite frequently. In general, finding optimal guessing paths is a hard problem (in particular harder than computing minimal guess bases, see Remark 8 and [19]). So, instead of finding an optimal one we restrain ourselves to finding paths of small weight. For this we propose Algorithm 3 as a greedy heuristic, based on the Marking Algorithm.
Input : A set C of weighted Horn clauses. Output : A guessing path G in C if there exists one; otherwise ∅.
1 Let G be an empty sequence in C and let M = ∅.
4 Choose F ∈ L of minimal weight.
5 Append F to G and adjoin Pos(F) to M.
Clearly, this algorithm cannot find optimal guessing paths in general, and should also not be expected to construct good GD attacks all the time. However, in our experiments this simple heuristic proved to be a satisfactory choice which yielded acceptable results.
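A greedy heuristic of this kind can be sketched in a few lines of Python. Since the listing of Algorithm 3 is only partially available here, the sketch assumes that the candidate set L consists of all clauses whose negative literals are already marked and whose positive literal is not; the clause encoding `(body, head, weight)` and the toy clauses are likewise assumptions.

```python
def greedy_guessing_path(clauses, variables):
    """Greedily build a guessing path; returns [] if not every variable can be marked."""
    path, marked = [], set()
    while marked != set(variables):
        # Assumed candidate set: applicable clauses that mark something new.
        applicable = [c for c in clauses
                      if set(c[0]) <= marked and c[1] not in marked]
        if not applicable:
            return []
        best = min(applicable, key=lambda c: c[2])  # clause of minimal weight
        path.append(best)
        marked.add(best[1])
    return path

# Toy clauses: full guesses (weight 8), one fault-derived fact (weight 1),
# and the exact relation x = y + z rewritten for each variable (weight 0).
variables = ["x", "y", "z"]
clauses = [
    ((), "x", 8), ((), "y", 8), ((), "z", 8),
    ((), "x", 1),
    (("x", "y"), "z", 0), (("x", "z"), "y", 0), (("y", "z"), "x", 0),
]
path = greedy_guessing_path(clauses, variables)
total_weight = sum(w for _, _, w in path)
```

On this toy input the heuristic first uses the cheap fault-derived fact, then one full guess, and finally the exact relation, so only one byte has to be guessed in full.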

The Fault Attack on Enocoro Stream Ciphers
In this section we combine the algorithms and ideas of the previous two sections and present our fault attack on the Enocoro stream cipher. The attack is constructed in such a way that a new fault injection is only required if the algorithm does not expect to find a solution in a small enough search space. The algorithm takes two parameters as input: $w_{\max}$ determines the maximal search space size that should actually be considered in the final step, and τ ≥ 0 specifies the number of time steps that are taken into account for the internal state retrieval.

Proposition 5. Algorithm 4 is a Las Vegas algorithm that computes the initial state $S^{(0)}$, i.e., it may not terminate, but when it terminates it computes $S^{(0)}$.
Proof. If the algorithm terminates, then it must stop in line 11. Now the output is correct by virtue of Algorithm 2.
In every iteration of line 2, our fault attack consists of up to two stages of exhaustive search: the first is to determine equivalent injection points for the m fault injections (line 6); the second is delegated to Algorithm 2 in line 10, but is only performed if it is reasonable to expect a small enough search space by virtue of line 9 (see Remark 10). Notice also that, by Proposition 2, the set L may contain up to 9 elements in every iteration. Therefore, in the worst case, line 6 is executed up to $9^m$ times. Because of that, our implementation uses (large) lookup tables for line 7. This is also the reason why we do not recommend searching for optimal guessing paths in line 8, as it would simply cost too much time. Instead, Algorithm 3 is a good enough choice. Note also that this heuristic is based on the Marking Algorithm, which is known to be a linear-time algorithm [31]. Thus, it should be investigated whether Algorithm 3 can be implemented with linear time complexity as well.

Algorithm 4: Enocoro Fault Attack Framework.
Input : An instance S of the Enocoro stream cipher with a fixed key and IV; parameters τ ≥ 0 and w max ≥ 0. Output : The initial state S (0) of S.
Increase m by one, reset S, start the keystream generation, and inject a random fault at a random time in T in a random register. Determine the first τ + 1 differential keystream bytes z τ diff,m from the faulty keystream.
Compute a set of Horn clauses C τ,F and their associated functions as described in Construction 2 using Remark 9.

8 Compute a guessing path $\mathcal{G}$ in $C_{\tau,F}$ of small weight $\omega(\mathcal{G})$ using Algorithm 3.
9 if $\omega(\mathcal{G}) \le w_{\max}$ then
10 Apply Algorithm 2 to the guessing path $\mathcal{G}$ and let S be the output.
11 if S = ∅ then remove F from $\mathcal{F}$ else return S.

As a minor optimization, the order in which the elements $F \in \mathcal{F}$ are processed in line 6 can be chosen according to the probabilities that the fault injections lie in the respective equivalence classes, as indicated already in Remark 5. This has no effect on the overall number of fault injections that are required for a successful attack, but ensures that the algorithm terminates faster on average.

Experiments and Timings
In this section we present experimental results of our fault attack (Algorithm 4) applied to Enocoro-128v2 using an implementation in the computer algebra system CoCoA-5 [28]. This system has a Pascal-like language and offers many specialized built-in functions, such as fast Gröbner basis calculations. Moreover, we were able to build upon the implementations for [19], shortening the implementation time and simplifying the code.
The set T that specifies the temporal resolution of the actual injections was chosen as {1, . . . , 10}. The results are based on a total of 500 simulations each, and the computations were carried out on a machine with an Intel Xeon E5-2623 v3 (3.00GHz) processor and 128 GB of RAM.
To aid the selection of the parameters τ and w max , we first consider a simplified version of the attack where the set F , in line 6, contains only a single element in every iteration. This corresponds to a fault attack with a fault model that also allows the attacker to retrieve the exact injection point. (In light of Remark 4.d this is, in practice, also equivalent to knowing just the time of the injection.) Observe that the number of fault injections required for a successful retrieval of the initial state and the weight of the corresponding guessing paths is the same as with our fault attack using the weaker fault model of Section 3. Thus, bad parameters for this simplified version are also bad ones for the full version.
From Proposition 3 we know that τ should be chosen larger than 23. Here we consider τ ∈ {25, 30, 35, 40} and $w_{\max}$ ∈ {16, 20}. Table 3 presents the timing of the simplified fault attack for these parameters. There one can see the weight of the final guessing path $w = \omega(\mathcal{G})$ of line 10 along with the average $\hat{w}$ of the exponential size of the search space that actually had to be considered. Furthermore, we provide the cumulative time of the repeated applications of our heuristic (Algorithm 3) used in line 8, the overall running time, and the average number of faults that had to be injected.

Table 3. Timings of the simplified fault attack on Enocoro-128v2 where we considered T = {1, ..., 10} and assumed that the injection points are always known. The results were obtained using 500 random key-IV pairs.
Columns: Params; Avg. (Table body not recovered.)

For larger parameters we did some preliminary experiments and can report that the variance of the timings increased significantly. Already for τ = 40 and $w_{\max}$ = 16, the timings ranged from a few seconds to over 10 h. Since we target a fault attack that works well in the clear majority of cases, we did no further investigations. Nonetheless, the actual size of the last search space exceeded $2^{w_{\max}}$ only rarely, i.e., Remark 10 seems to be applicable in practice.
As for smaller parameters, we also ran experiments with τ = 30 and w max = 2. On average, such an attack requires 9.96 fault injections and takes no more than 90 s to solve for the internal state. This shows that, with the simplified version of our attack corresponding to the stronger fault model, the internal state can be retrieved in negligible time with only about 10 fault injections. (In light of Remark 7 this shows that our structured formalization of the information gain of several fault injections and their interplay pays off very well.) Looking at the table, we see that the differences between τ = 40, τ = 35, and τ = 30 can mainly be described as a significant decrease in the timing and a small (negligible) increase in the number of required fault injections. Only when we consider τ = 25, the trade-off between timing and number of faults becomes noticeable. Since the complexity of our final fault attack is expected to be exponential in the number of required faults, we consider the trade-off here as acceptable, and favour a slightly larger τ. This motivates why we only consider τ = 30 for the evaluation of our fault attack.
In Table 4 the results of 500 simulations of our fault attack on Enocoro-128v2 (based on the fault model of Section 3) are shown. As above, we give information on the total time spent on finding guessing paths of low weight with Algorithm 3, the average weight of the last iteration of line 6, the logarithm of the average search space size that had to be considered, and the average number of faults that needed to be injected. As one would expect, the variance of the timings increased significantly compared to the simplified attack. Nonetheless, in 95% of all cases the attack finished within 2.5 h of computation. Moreover, we see that the attack is faster if we choose the larger value of $w_{\max}$. At first this may seem counter-intuitive, as this parameter bounds the search space size of the inner exhaustive search of the attack. However, the choice of $w_{\max}$ also affects the number of fault injections that are required for the total attack: a smaller $w_{\max}$ requires more fault injections, and every additional fault injection enlarges the outer exhaustive search over the candidate injection points. Hence, in view of the fault model, fewer fault injections actually benefit the timings. Choosing τ = 30 and $w_{\max}$ = 20, we end up with a fault attack that is able to retrieve the internal state, and thereby break the cipher, in about 20 min using fewer than five fault injections on average.
To conclude, we have presented a fault attack framework for the family of Enocoro stream ciphers that is based on a rather weak fault model, and successfully applied it to the standardized cipher Enocoro-128v2. This clearly shows that appropriate countermeasures should be developed and installed. One suggestion that we would like to offer is to insert a non-linear filter near the keystream output into the definition of the cipher.