Low-Complexity Nonlinear Self-Inverse Permutation for Creating Physically Clone-Resistant Identities

Abstract: New large classes of permutations over $\mathbb{Z}_{2^n}$, based on T-functions and serving as Self-Inverting Permutation Functions (SIPFs), are presented. The presented classes exhibit negligible or low complexity when implemented in emerging FPGA technologies. The target use of such functions is in creating the so-called Secret Unknown Ciphers (SUCs) to serve as resilient clone-resistant structures in smart non-volatile Field Programmable Gate Array (FPGA) devices. The SUC concept was proposed a decade ago as a digital, consistent alternative to the conventional analog, inconsistent Physical Unclonable Functions (PUFs). The proposed permutation classes are designed and optimized particularly to use non-consumed Mathblock cores in programmable System-on-Chip (SoC) FPGA devices. Hardware and software complexities for realizing such structures are optimized and evaluated for a sample expected target FPGA technology. The attained security levels of the resulting SUCs are evaluated and shown to be scalable and usable even for post-quantum cryptography.


Introduction
The demand for a more efficient and secure physical identification of the participating entities in a security architecture is a permanent technological need. Unclonable physical identities are emerging to play a crucial role in contemporary and future security systems such as those dealing with smart houses, smart cities, smart healthcare, etc. [1]. In the last two decades, many physical identification approaches have been proposed based on simply verifying a stored secret key in an embedded non-volatile memory (NVM) [2]. This approach is inherently cloneable, as somebody knows the stored secret keys. Moreover, it has mostly failed to withstand simple replacement and physical attacks [3]. A more sophisticated approach was proposed based on "unknown" Physical Unclonable Functions (PUFs) serving as a physically unclonable identity for RFIDs, smart cards, mobile devices, and, generally, Internet of Things (IoT) devices [2]. PUFs use intrinsic properties of electronic device structures that cannot be manufactured identically, which is comparable to inborn biological DNA. PUF properties are fully random, obscure, and unpredictable. Therefore, PUFs are theoretically considered as unknown functions that are impossible to clone [4]. However, all PUFs share the property of being inconsistent, as they are unknown continuous "analog" mappings.
Our proposed "digital" Secret Unknown Cipher (SUC) concept was introduced a decade ago in [5] as electronic DNA (e-DNA). Compared to PUFs, SUC physical modules are highly consistent, as they are pure digital structures. SUCs are considered "clone-resistant digital functions". They are self-created and embedded within off-the-shelf System-on-Chip (SoC) Field Programmable Gate Array (FPGA) devices in a post-fabrication process, so that the device manufacturer can be excluded from the security process. This work shows a possible approach towards creating such SUCs in emerging future SoC FPGA technologies. The particular property of the proposed cipher classes for SUCs is that they use unused hardwired arithmetic core modules that are already available in most modern FPGAs.
The main contributions of this work are in devising a new technique for converting future smart programmable VLSI devices into physically hard-to-clone units as an alternative to conventional PUF technologies. The technique is based on a new concept for creating permanent and highly resilient digital SUCs. To this end, new special huge classes of FPGA-optimized ciphers are introduced based on low-cost self-inverse permutation functions deploying special hardwired FPGA arithmetic cores. The targeted VLSI technologies are emerging future smart self-reconfiguring and non-volatile SoC FPGA devices that can accommodate the proposed SUC modules. Such technologies are expected to emerge in the near future. The paper introduces a new unknown cipher generator based on new self-inverse permutations to construct digital SUCs that replace the inconsistent analog PUFs in a large class of applications.
The paper is organized as follows.
• Section 2 reviews the state of the art of PUFs as recently used unknown functions. Other proposals of unknown functions were carefully reviewed in [6].
• In Section 3, the creation process of SUCs is presented in more detail to make the paper self-contained.
• Then, the basic algebra to be deployed for cipher construction is defined in Section 4, based on an expected future VLSI environment.
• In Sections 5, 6 and 7, the newly found large classes of self-inverse permutations over $\mathbb{Z}_{2^n}$ are introduced.
• In Section 8, the hardware implementation complexities are evaluated.
• Finally, the resulting cipher structures and their attained security levels are discussed in Section 9.
• Section 10 concludes the results.

Background Motivation and State of the Art on Physical Unclonability
Tremendous research efforts have been made over the last two decades to create unclonable entities. Conventional Physical Unclonable Functions (PUFs) were introduced as technologies to make physical units unclonable, such as in [2], [7], and [8]. A PUF was first introduced as a physical one-way function in an optical environment [9]. Then, the PUF was proposed as a controlled physical random function in [10]. Later, the PUF was defined as an unknown function [11]. Several electronic PUFs were presented, such as ring oscillator PUFs [10] [12], TERO-PUF [13], arbiter PUFs [14], chaos-based PUFs [15], etc. Furthermore, PUFs were classified into two categories: if a PUF can produce an enormous number of input-output pairs (challenge-response pairs, CR-pairs), it is classified as a strong PUF. Otherwise, if it produces only a limited number of CR-pairs, it is classified as a weak PUF [2].
In the following sections, several proposals deploying PUFs are selected to be reviewed as being related to our work. The following technical discussions cover also PUF-vulnerabilities, as well as modeling attacks, as the most important threats facing PUF technologies.

PUFs Drawbacks and Disadvantages
As nonlinear and unknown analog mappings, most PUFs cannot perfectly reproduce their output responses for the same input stimuli. Such response inconsistency has many causes, including environmental perturbations such as supply-voltage variation, noise, operating temperature, aging, and many other specific effects. The PUF drawbacks can be classified as follows [2]:
1. Inconsistent input-output behavior and, consequently, inconsistency in the reproducibility of the PUF's CR-pairs.
In addition, there are two main security drawbacks, namely:
2. Possible correlations in PUF CR-pairs, allowing modeling attacks.
3. A limited or small number of possible distinct CR-pairs, which simplifies cloning attacks.

Counteracting the Drawbacks of PUFs
Several PUF-improvements were presented to overcome PUF-drawbacks to make them more robust, reliable and secure. Such improvements were conducted in [16], [17], [18], [19] and [20] based on error-correction techniques and/or combining PUFs with additional cryptographic primitives.
To counteract the PUF-inconsistency drawback, complex fuzzy extractors were introduced as a remedy [21]. Such fuzzy extractors deploy Helper Data Algorithm (HDA) and/or error correction code (ECC) procedures to stabilize the PUF's inconsistent responses. As a result, the need for a fuzzy extractor makes most of the proposed PUFs very costly to implement and adds latency, which conflicts with most requirements of a lightweight design. For instance, the fuzzy extractor proposed in [20] requires 5.1 times the PUF's resources to produce more consistent responses.
Correlation drawbacks allow several security threats against PUF technology. Recently, modeling attacks were considered among the most important PUF threats. Highly correlated PUF CR-pairs allow predictive models of the PUFs to be built [22], [23], [24], and [25]. The general methods of modeling attacks on PUFs can be summarized as follows:
• Modeling Attack using Machine Learning (ML) Algorithms: New PUF attacks based on ML modeling were found, as in [22], with alarmingly high prediction rates approaching 99%. The predictive models of various delay-PUF proposals constructed with ML techniques attain the following error ratios [22]: less than 1% for Arbiter PUFs, 1% for XOR Arbiter PUFs, 4.5% for Feed-Forward Arbiter PUFs, and less than 1% for Ring Oscillator PUFs.
• Modeling Attack using PUF-Codebook: Several PUFs generate a limited number of CRPs [2]. For instance, Ring Oscillator PUFs produce only a quadratic number of CRPs. Therefore, an adversary can produce a codebook of the PUF containing a look-up table of all CRPs to imitate the PUF [23].
For instance, and in reference to Figure 1(a), a controlled PUF was proposed in [10] by combining a random hash function with a weak PUF to prevent modeling attacks. As HDA is still required, the resulting overall high complexity is mostly not acceptable for practical use.
Figure 1. Adapted from [10,20,26]. PRF = pseudorandom function; LFSR = linear-feedback shift register; ECC = error correction code.
In reference to Figure 1(b) and according to [26], a PUF-based unknown key generation was proposed. The proposed mapping structure uses a PUF as a key generator and a Pseudorandom Function (PRF) as the input-output mapping, resulting in a large usable operational space with no correlation and a large number of usable distinct pairs. Again, a complex HDA is required, resulting in an overall high complexity that is unattractive for practical use.
In reference to Figure 1(c) and according to [20], a weak PUF is combined with a lattice cryptographic algorithm and a linear-feedback shift register (LFSR) to construct a strong PUF. The LFSR connection polynomial is used to expand the PUF's CR-space, whereas the lattice mapping counteracts machine learning attacks.
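The codebook attack on a PUF with a small CR-space, as described earlier in this section, can be illustrated with a minimal Python sketch. The simulated ring-oscillator PUF below is purely hypothetical: the class name `ToyRingOscillatorPUF`, the frequency model, and all parameter values are assumptions made for illustration and are not taken from [2] or [23]. The model compares the frequencies of two oscillators, so N oscillators yield only N(N-1)/2 distinct challenges, which an adversary can tabulate exhaustively.

```python
import itertools
import random


class ToyRingOscillatorPUF:
    """Hypothetical, noise-free model of a ring-oscillator PUF (illustration only)."""

    def __init__(self, num_oscillators, seed):
        rng = random.Random(seed)
        # Assumed model: each oscillator has a fixed, device-specific frequency.
        self._freqs = [rng.gauss(100.0, 1.0) for _ in range(num_oscillators)]
        self.num_oscillators = num_oscillators

    def respond(self, challenge):
        """Challenge = pair (i, j) with i < j; response = 1 if oscillator i is faster."""
        i, j = challenge
        return int(self._freqs[i] > self._freqs[j])


def all_challenges(num_oscillators):
    """Enumerate the N(N-1)/2 distinct oscillator pairs."""
    return list(itertools.combinations(range(num_oscillators), 2))


def build_codebook(puf):
    """An adversary with temporary access records every CR-pair in a look-up table."""
    return {c: puf.respond(c) for c in all_challenges(puf.num_oscillators)}


puf = ToyRingOscillatorPUF(num_oscillators=16, seed=1)
codebook = build_codebook(puf)
print(len(codebook))  # number of tabulated CR-pairs: 16 * 15 / 2
```

Once the table is complete, the adversary can answer every challenge without the device, which is why a small CR-space simplifies cloning.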

PUFs Use-Cases and their Weaknesses
Several PUF-based authentication protocols have been presented. For instance, a delay PUF was deployed as a key generator in an anonymous e-money transaction protocol [27]. In [28], a PUF is proposed for an RFID-authentication protocol. Here, the designed protocol requires an extra hash function, and the proposed PUF needs to exchange extra data with the server to attain a stable response. In [7], resource-constrained Internet of Things (IoT) devices deploying PUFs were studied and analyzed. PUFs were introduced as a lightweight solution to ensure secure communications among IoT devices. Unfortunately, the results showed that most PUF proposals need at least an extra cryptographic algorithm or secure memory to perform the authentication protocols. This need for cryptographic primitives increases the energy and computational resources required for performing PUF-based authentication protocols. In [2], nineteen PUF-based authentication protocols were studied and analyzed. The results showed that most PUFs are not resilient against modeling attacks, that most PUF CR-spaces require expansion to be sufficiently large for entity authentication, and that CR-pair management is very costly due to the collision-prone properties of all PUF-mappings.

Recent Alternatives to Analog PUFs
The SUC concept as an alternative, fully digital substitute for the analog PUF was proposed by the authors a decade ago in the public literature, as in [5] and [11]. SUCs, as pure digital structures, are ultimately consistent [29]. Additionally, SUCs as designed invertible PRFs offer full usage of all $2^n$ input-output pairs as plaintext-ciphertext pairs, where n is the cipher input size.
A possible SUC creation scenario as Feistel-like ciphers was presented recently in [6]. Involutive and non-involutive SUC classes were proposed in [30]. In [31], a new family of stream ciphers deploying non-linear feedback shift registers was proposed as a possible SUC class. In [32], an SUC as a random stream cipher based on T-functions was proposed. The current paper presents new generalized large classes of useful, special, low-cost self-inverse permutation mappings for SUC constructions, deploying pure FPGA resources as arithmetic cores based on T-functions.
To show in a very simplified way the differences of our proposal to the somehow digitized analog PUFs with some relation to the SUC concept, the following two examples are depicted from the public literature.
Example 1: In Figure 2(a), the concept of a PUF-based unknown key generation was proposed to construct a strong PUF. In particular, a weak SRAM PUF is used as a key generator for a hardwired Advanced Encryption Standard (AES) cipher [19]. Note that the resulting virtual PUF's input-output mapping is invertible. Therefore, it produces $2^n$ input-output pairs as plaintext-ciphertext pairs, where n is the cipher input size. Again, HDA or equivalent techniques are still required to stabilize the noisy responses of the SRAM PUF used, and the overall hardware complexity is quite high [19]. In comparison, our SUC solution on the right side substitutes all required components with unknown digital structures. No complex fuzzy extractor with limited consistency is required. Therefore, the proposed SUC structures meet the lightweight design requirements for mass products.
Example 2: Figure 2(b) shows a proposed Feistel-like block cipher structure deploying PUFs to create an unknown f function [33]. Note that a block cipher deploying PUFs to create f is equivalent to a block cipher with a secret unknown mapping; here, 4 x 4-bit SRAM-PUFs are utilized as unknown confusion functions. The resulting cipher is a PRF with a fully usable input-output space; however, it is again highly complex due to the use of HDA. The comparable SUC structure on the right side is equivalent to our proposed SUC alternative: a key-alternating cipher using unknown new T-function-based arithmetic involutions as round functions, as will be shown later on.
Notice that both examples are still using inconsistent analog PUFs with all the above-mentioned drawbacks and disadvantages.

The SUC Concept and Technology Background
The target of this section is to make the paper self-contained and better understandable for the reader, as the SUC concept is not widely known in the public literature. This section is a slightly modified copy of Section 3 of our earlier publication [6] on SUC design techniques.
The unknown cipher concept is an entirely new security paradigm in the public literature. The unknown cipher here does not deal with protecting the communications or the links between at least two parties, as a sender and a receiver, which requires the cipher to be commonly known to both parties (Kerckhoffs's principle). In particular, the SUC is fundamentally designed for the identification process to serve as a clone-resistant identity [5]. We postulate that "unclonability" is only possible if unknown structures are created. Therefore, a cipher designed to be embedded as a structure that is unknown to anybody (including its designer) does not violate Kerckhoffs's principle. On the other hand, SUC should not be confused with "security by obscurity", where the cipher is designed by a cryptographer, known to the manufacturer, and then kept secret and obscure. SUC creation is a very challenging task. Figure 3 illustrates a possible SUC creation concept in a non-volatile (NV) FPGA device having internal self-reconfiguration capability. A large class of ciphers $\{E_1, E_2, \dots, E_\sigma\}$ is first created such that $\sigma \to \infty$ and offered for selection. Then, a single-event process triggers the FPGA-internal true random number generator (TRNG), which randomly selects an unknown cipher $E_j$ from the practically unbounded number σ of created distinct ciphers. A TRNG hardware module is offered in virtually all modern FPGA devices, fulfilling the NIST state-of-the-art standard cryptographic requirements (see the TRNG-module specifications of the FPGA [34] used in our proposed prototyping). After this process, all the dashed entities in Figure 3 are irreversibly killed and fully removed from the chip. The self-reconfiguration in the chip is then irreversibly locked (by a flash bit or fuse) to prohibit any repetition of that single-event SUC creation process. That is, the created SUC is no longer removable or changeable, like a DNA.
This concept was described extensively in the last decade in our earlier publications [5], [29], and [35].
The resulting cipher is a secret, yet unknown, cipher and is a non-repeatable selection. It is even an unknown choice to the cipher designer/creator himself. The "Secret Unknown Cipher" (SUC) is realizable in an emerging VLSI device that allows self-creation of permanent unknown usable secret structures as "an electronic mutation", as indicated in [36]. Note that for the functionality of the concept, there is no need to publish the SUC creation procedure/program of the cipher class, which is designated from now on as the "GENIE" as a smart cipher designer. However, for worst-case security analysis, we assume that the cipher creating "GENIE" is published.

Creation Concept of Unknown Ciphers as Clone-Resistant Entities/Modules
The proposed SUC is conceptually based on the following principle: "The only secret which can be kept unrevealed is the one which nobody knows" [37]. From a practical point of view, if the cipher creator itself cannot predict and foretell exactly the created cipher, then the cipher is considered as not known when the cipher class size → ∞. Figure 4 illustrates a possible SUC creation that is assumed to be processed in a secure environment. The process may proceed as follows:

SUC creation phase:
1. A trusted authority (TA) injects the software package "GENIE" one time into a system-on-chip (SoC) device as an SUC creator for a short time (as much time as required to create an unknown cipher, which is usually a few milliseconds).
2. Then, the GENIE is internally triggered to generate/select a permanent and unpredictable secure cipher with the help of an internal, non-repeatable, unpredictable, and unknown bit stream from the in-chip TRNG.
3. After creating an SUC, the GENIE is completely and irreversibly deleted. What remains is a non-removable, unchangeable, and unknown operational cipher (the SUC) that nobody knows.

SUC personalization phase:
4. TA randomly selects a set $\{x_1, \dots, x_T\}$ of cleartext vectors out of the $2^n$ possible combinations, where n is the size of the SUC input/output space in bits.
5. TA stimulates the SoC device to encipher the cleartext vectors into the ciphertexts $\{y_1, \dots, y_T\}$ using its SUC within the device.
6. The resulting T pairs $(x_i, y_i)$ are stored as secret pairs in the individual (personal) device records by the TA. The records should be kept secret for later use.
As the created TRNG bits are fully and exclusively responsible for creating the SUC, and as TRNG bits are unpredictable, non-repeatable, and unknown, the resulting created SUC in the SoC device is also unknown and unpredictable, such that:
$SUC_t = GENIE(TRNG_t)$ for every $t > 0$.
This implies that
$SUC_t: \{0,1\}^n \times \{0,1\}^{k_t} \to \{0,1\}^n$,
where n is the bit size of the SUC input/output space and $k_t$ is the bit size of the cipher's secret key. Thus, the maximum number of distinct possible permutations is $\sigma < (2^n)!$, as $(2^n)!$ is the number of all possible $\{0,1\}^n$ to $\{0,1\}^n$ permutations. Therefore, in that case the number of possible selectable block ciphers of block size n is [...]. In addition, the SUC has the property of being able to generate a large number of distinct CRPs as cleartext/ciphertext pairs, which is upper bounded by $2^n$. This counteracts the lack of CR-space in the case of traditional analog PUFs.
The created cipher $SUC_t$ is a result of the $TRNG_t$ random sequence that is not known to anyone. Moreover, it is highly probable that for any two time points $t_1 \neq t_2$, $SUC_{t_1} \neq SUC_{t_2}$. Therefore, each resulting SoC device has its individual SUC with a probability approaching 1.
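The creation and personalization phases above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: `genie_create_suc` and `personalize` are hypothetical names, the "cipher class" is simply the set of all permutations of an 8-bit block, and `secrets.SystemRandom` (operating-system entropy) stands in for the in-chip TRNG. The sketch also evaluates $\log_2((2^n)!)$ to show how quickly the permutation bound grows.

```python
import math
import secrets

N_BITS = 8  # toy block size, chosen only to keep the example small


def genie_create_suc(n_bits, trng):
    """Hypothetical GENIE: picks one permutation of {0, ..., 2^n - 1} at random."""
    table = list(range(1 << n_bits))
    trng.shuffle(table)  # random shuffle driven by the TRNG stand-in
    inverse = [0] * len(table)
    for x, y in enumerate(table):
        inverse[y] = x
    # The permutation tables stay hidden inside the returned closures.
    return (lambda x: table[x]), (lambda y: inverse[y])


def personalize(encrypt, n_bits, num_pairs, rng):
    """TA side: record T secret (x_i, y_i) pairs produced by the device's SUC."""
    cleartexts = rng.sample(range(1 << n_bits), num_pairs)
    return [(x, encrypt(x)) for x in cleartexts]


trng = secrets.SystemRandom()  # OS entropy source standing in for the in-chip TRNG
suc_encrypt, suc_decrypt = genie_create_suc(N_BITS, trng)
records = personalize(suc_encrypt, N_BITS, num_pairs=16, rng=trng)

# log2 of (2^n)!, the number of all permutations of {0,1}^n, via the log-gamma function.
log2_num_permutations = math.lgamma((1 << N_BITS) + 1) / math.log(2)
print(round(log2_num_permutations))
```

A real GENIE would select from a structured cipher class rather than store a full table, since a table of $2^n$ entries is infeasible for practical block sizes.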
How to Use an SUC? Figure 5 shows a generic two-way identification protocol using such SUCs for authenticating a personalized $SoC_A$ device.
An SUC-based identification protocol may proceed as follows:
1. A secret pair $(x_i, y_i)$ is randomly chosen from the TA's secret records of $SoC_A$. Then, the TA challenges the $SoC_A$ device with the cryptogram $y_i$ over an insecure channel.
2. The $SoC_A$ device responds by sending the decrypted cleartext $x'_i$.
3. If $x'_i = x_i$, then the $SoC_A$ device is deemed to be authentic, and the pair $(x_i, y_i)$ is marked as used and never used again, which avoids replay attacks.
Refined versions of this protocol are further developed in [38]. It is shown that much more efficient and low-cost CR-pair management is possible due to the invertibility of the SUC (a cipher has no collisions), compared to the collision-prone properties of all PUF-mappings (which behave somewhat like unknown hash functions).
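The three protocol steps above can be sketched as follows. The sketch is a toy under stated assumptions: the class names `ToyDevice` and `TrustedAuthority` are hypothetical, the SUC is modeled as a random 16-bit permutation table, and the channel is a plain function call.

```python
import secrets


class ToyDevice:
    """Stand-in for SoC_A: holds a secret permutation that acts as its SUC."""

    def __init__(self, n_bits, rng):
        self._enc = list(range(1 << n_bits))
        rng.shuffle(self._enc)
        self._dec = [0] * len(self._enc)
        for x, y in enumerate(self._enc):
            self._dec[y] = x

    def encrypt(self, x):
        return self._enc[x]

    def decrypt(self, y):
        return self._dec[y]


class TrustedAuthority:
    """Keeps the secret (x_i, y_i) records and runs the challenge-response check."""

    def __init__(self, device, n_bits, num_pairs, rng):
        xs = rng.sample(range(1 << n_bits), num_pairs)
        self._unused = [(x, device.encrypt(x)) for x in xs]

    def authenticate(self, device):
        if not self._unused:
            raise RuntimeError("no unused CR-pairs left")
        x_i, y_i = self._unused.pop()      # step 1: pick a pair and never reuse it
        x_response = device.decrypt(y_i)   # step 2: the device answers with x'_i
        return x_response == x_i           # step 3: accept only if x'_i == x_i


rng = secrets.SystemRandom()
genuine = ToyDevice(n_bits=16, rng=rng)
ta = TrustedAuthority(genuine, n_bits=16, num_pairs=8, rng=rng)
print(ta.authenticate(genuine))
```

Because every pair is discarded after one use, the number of stored pairs bounds the number of authentications, which is one reason efficient CR-pair management matters.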

SUC Application Spectrum
The concept of SUCs as clone-resistant entities offers a new and attractive spectrum of applications. In [39], basic identification and usage scenarios are presented. In [40], a use case for securing IP-cores in an FPGA environment is introduced. Recently, the authors published several IoT use cases for SUCs attaining very efficient management for secured remote sensing [41]. A Fleet Management System (FMS) for secured logistic operations is presented in [42]. In [43], a novel concept for electronic wallets for e-cash is presented.

Targeted SUCs Realization in Non-Volatile VLSI-FPGA Environment
Microsemi is the only provider of non-volatile FPGA technology with flash-based distributed switching fabrics and programmable cells. One of the advanced non-volatile SoC FPGAs from this provider is the SmartFusion®2 family. Its flash-based FPGA fabric incorporates an integrated ARM Cortex-M3 processor together with powerful arithmetic units (MACC) and high-performance communication interfaces, all integrated in a single chip. The infrastructure is smart enough to allow a GENIE program to create a cipher within the SoC unit. However, self-reconfiguration and irreversible reconfiguration locking are still not possible in such non-volatile devices, so the self-creation of permanent unknown "hard-wired" structures as SUCs is not yet enabled. Self-reconfiguration is already available in RAM-based FPGA technologies and is expected to emerge similarly in flash-based non-volatile technology in the near future. Assuming that self-reconfiguration becomes possible in the future, designers can devise mechanisms for SUC creation in such devices. The greatest technology challenges in the self-creation of SUCs can be summarized in the following two categories:
1. Designing a GENIE program as a "smart VLSI-designer" that can extend an existing FPGA design without violating the technology design rules.
2. Designing a GENIE that can serve as an obedient "smart cipher creator" fulfilling all necessary security requirements to come up with a really unknown and unpredictable cipher (SUC).
Both challenge categories are technically very hard to realize in an ultimate way. However, there are no technical reasons to believe that SUC creation as a concept is an impossible mission. The cipher concept proposed in this paper shows one possible way to address both challenges with a promising and practically realizable approach.
The main objective of this work is to approach strategies towards making these targets technically possible at low area cost and low GENIE complexity in time and memory. The particular strategies selected in this work are:
1. Using unconsumed FPGA resources, in this case the hardwired arithmetic addition and multiplication cores.
2. Optimally deploying the technology resources to hide the SUC structure's keys and function parameters.
Figure 6 shows a possible scenario for an incremental embodiment ("mutation") of an SUC in a Microsemi SmartFusion®2 SoC FPGA. In this scenario, the GENIE should use only resources outside the "functional HW-Cores". That is, it mainly uses unused "free Mathblock" resources, "free FPGA fabric", and "free NV memory". The result is a set of distributed SUC hardware and software components at possibly individual locations in each mutated device. The existing DSP blocks, also known as Mathblocks (MACCs), include multipliers that are optimized, for example, to be efficiently configured to perform 18x18 or dual 9x9 multiplications, allowing the implementation of scalable arithmetic. To avoid complex modular arithmetic, integer arithmetic modulo $2^n$, i.e., the ring $\mathbb{Z}_{2^n}$, is adopted as the basic cipher algebra.
Each "Mathblock" offers the following capabilities [34]: Built-in addition, subtraction, and accumulation units to combine multiplication results efficiently. A multi-precision arithmetic can even be realized in hardware to reach easily a cipher block size n of 64 bits.
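As an illustration of the multi-precision idea, the following Python sketch computes a 64-bit product modulo $2^{64}$ from small partial products only. The 16-bit limb width is an assumption chosen so that each partial product would fit an 18x18 multiplier; the function names are hypothetical, and the actual Mathblock configuration is not modeled.

```python
LIMB_BITS = 16                # assumed limb width, chosen to fit an 18x18 multiplier
LIMB_MASK = (1 << LIMB_BITS) - 1


def split_limbs(value, num_limbs):
    """Split an integer into little-endian limbs of LIMB_BITS bits each."""
    return [(value >> (LIMB_BITS * i)) & LIMB_MASK for i in range(num_limbs)]


def mul_mod_2n(a, b, n_bits=64):
    """Compute (a * b) mod 2^n using only limb-sized partial products.

    n_bits is assumed to be a multiple of LIMB_BITS.
    """
    num_limbs = n_bits // LIMB_BITS
    a_limbs = split_limbs(a, num_limbs)
    b_limbs = split_limbs(b, num_limbs)
    mask = (1 << n_bits) - 1
    acc = 0
    for i in range(num_limbs):
        # Partial products with i + j >= num_limbs only affect bits >= n
        # and vanish modulo 2^n, so they are skipped.
        for j in range(num_limbs - i):
            partial = a_limbs[i] * b_limbs[j]          # one 16x16 multiplication
            acc = (acc + (partial << (LIMB_BITS * (i + j)))) & mask
    return acc


a = 0x0123456789ABCDEF
b = 0xFEDCBA9876543211
print(hex(mul_mod_2n(a, b)))
```

Working modulo $2^n$ means the upper partial products are simply dropped, so a 64-bit modular product needs 10 instead of 16 limb multiplications in this arrangement.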
Finally, the SoC units should be able to irreversibly prohibit reconfiguration after creating the SUC by irreversibly setting a final hardware lock fuse.
Based on these existing FPGA resources, the authors consider the cases where real applications do not consume all the available arithmetic cores. This work therefore seeks ciphering classes using mainly simple multiply, add, and subtract arithmetic in $\mathbb{Z}_{2^n}$.
The resulting requirements on the cipher design are:
1. Designing huge classes of self-inverse permutation functions modulo $2^n$ using multiplication as a major ciphering function.
2. Creating cascades of such permutations to build powerful SUCs to serve as "SUC-PUFs".

Preliminaries on Crypto-Permutations
This section contains a short introduction to permutations for cipher design. Basic mathematical concepts and lemmas with proofs are introduced to be used throughout this work.

Early Work on Permutation for Ciphering Stages
In cryptography, confusion refers to providing a high degree of non-linearity between the plaintext and the ciphertext of a cipher, while diffusion refers to the effect of each plaintext and key bit on each bit of the ciphertext. Diffusion and confusion are considered the two essential ciphering requirements that inhibit statistical analysis of the input-output behavior. A substitution box (S-Box) is usually deployed as the confusion component in modern ciphers. The diffusion component (layer) is widely deployed in different forms as a permutation mapping. For instance, the diffusion component is implemented as a fixed wired permutation in the design of the ultra-lightweight cipher PRESENT [44]. In [45], the diffusion layer is wired as a permutation of eight 4-bit words in the design of a lightweight cipher. In addition, special matrices were also used as permutation functions in the diffusion layer, such as in the AES cipher [46]. Recently, Yunwen et al. [47] proposed two types of nonlinear functions as alternative diffusion components. The first type relies on a nonlinear code known as the Kerdock code, and the second type relies on T-functions.
On the other hand, Rivest [48] introduced the exact conditions required to find permutation polynomials modulo $2^n$. Moreover, Singh et al. [49] generalized the conditions for permutation polynomials over $\mathbb{Z}_{p^n}$, where p is a prime integer. In [50], the essential conditions for creating self-inverse permutation polynomials of degree 2 over the ring $\mathbb{Z}_{2^n}$ were determined. Another application, which investigates permutation-polynomial-based interleavers over integer rings for turbo codes, is presented in [51]. A quadratic permutation polynomial and its inverse over $\mathbb{Z}_{2^n}$ are presented in [52] and [53]. Furthermore, an interesting work by Klimov [54], [55] introduced the bit-slice analysis method to create the so-called T-functions. Using this method, Klimov et al. [54] proposed more generalized polynomial structures with integer coefficients using (+), (-), and the Boolean operators over $\mathbb{Z}_{2^n}$. They obtained the necessary and sufficient conditions under which such polynomial structures produce permutation functions.

Permutations
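Let R be a finite ring. It is well known that not every function $f: R \to R$ is a permutation of R. If a polynomial function f is a one-to-one mapping, then it is called a permutation polynomial (PP) over R.
For instance, a polynomial $P(x) = ax + bx^2$ is a quadratic permutation polynomial (QPP) over $\mathbb{Z}_{2^n}$ if a is odd and b is even.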
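The parity condition can be checked exhaustively for a small word size. The sketch below assumes the quadratic is written as $P(x) = ax + bx^2 \bmod 2^n$, with a as the linear coefficient and b as the quadratic coefficient, which is the reading consistent with "a is odd and b is even"; the helper names are hypothetical. It enumerates all coefficient pairs for n = 6 and compares the permuting ones against the parity rule, which is the quadratic special case of Rivest's condition [48].

```python
def is_permutation(func, n_bits):
    """True if func maps {0, ..., 2^n - 1} one-to-one onto itself."""
    size = 1 << n_bits
    return len({func(x) % size for x in range(size)}) == size


def quadratic(a, b, n_bits):
    """Return P(x) = a*x + b*x^2 reduced modulo 2^n."""
    mask = (1 << n_bits) - 1
    return lambda x: (a * x + b * x * x) & mask


N_BITS = 6
permuting = [
    (a, b)
    for a in range(1 << N_BITS)
    for b in range(1 << N_BITS)
    if is_permutation(quadratic(a, b, N_BITS), N_BITS)
]
# Compare against the parity rule: a odd and b even.
rule_holds = all(a % 2 == 1 and b % 2 == 0 for a, b in permuting)
print(len(permuting), rule_holds)
```

For n = 6 there are 32 odd choices for a and 32 even choices for b, so exactly 32 x 32 pairs are expected to permute $\mathbb{Z}_{2^6}$.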
Definition 5.1: A self-inverse function, or involution, is a function f that is its own inverse, i.e., $f(f(x)) = x$ for all x.
The following theorem gives the conditions for a QPP to be a self-inverse QPP.

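Theorem 5.2: Let n > 2 be an integer and let $P(x) = ax + bx^2$ be a QPP over $\mathbb{Z}_{2^n}$. Then $P(x)$ is self-inverse if and only if the following conditions hold [50]:
1. [...]
2. If n is even, then [...]
3. If n is odd, then [...] and v a unit in $\mathbb{Z}_{2^n}$.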
In cryptographic applications, self-inverse permutations (involutions) are preferably used in the rounds of block ciphers to attain low implementation complexity. Unfortunately, the permutation polynomials of the previous class do not have the required strong cryptographic properties. In fact, at least two fixed points exist in the resulting permutation for every self-inverse permutation polynomial over $\mathbb{Z}_{2^n}$. This weakness of the previous class of self-inverse polynomials is proved in the following lemma:
Lemma 5.3: Any $P(x) = ax + bx^2$ satisfying the conditions of Theorem 5.2 has exactly two fixed points.
Proof: Clearly, $x = 0$ is a first fixed point for every self-inverse permutation polynomial of the previous class.
Let n be an even number (the odd case can be proved similarly). For [...] To prove that every self-inverse permutation polynomial of the previous class has no more than the two fixed points $x = 0$ and $x = 2^{n-1}$, suppose that $x_0 \notin \{0, 2^{n-1}\}$ is a fixed point. Then, [...] After substitution, [...] Multiplying both sides by 2 results in [...] Applying mod $2^n$ on both sides results in [...] is not an invertible element over $\mathbb{Z}_{2^n}$. Therefore, $x_0 = 0$, which contradicts the assumption $x_0 \notin \{0, 2^{n-1}\}$. As a result, every self-inverse permutation polynomial of the previous class has no more than the two fixed points $x = 0$ and $x = 2^{n-1}$ given above, Q.E.D.
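The fixed-point behavior can be observed numerically on one concrete involution. The coefficient choice below, $a = 2^n - 1$ (that is, $a \equiv -1$) and $b = 2^{n/2}$ for even n, is an assumption made for illustration; it yields a self-inverse quadratic because $a^2 \equiv 1$, $ab(1 + a) \equiv 0$ and $b^2 \equiv 0 \pmod{2^n}$, but it is not claimed to cover the whole class of Theorem 5.2. The helper names are hypothetical.

```python
def make_qpp(a, b, n_bits):
    """Return P(x) = a*x + b*x^2 reduced modulo 2^n."""
    mask = (1 << n_bits) - 1
    return lambda x: (a * x + b * x * x) & mask


def is_involution(func, n_bits):
    """True if func(func(x)) == x for every n-bit x."""
    return all(func(func(x)) == x for x in range(1 << n_bits))


def fixed_points(func, n_bits):
    """List every n-bit x with func(x) == x."""
    return [x for x in range(1 << n_bits) if func(x) == x]


N_BITS = 8
a = (1 << N_BITS) - 1          # a = -1 mod 2^n (assumed example choice)
b = 1 << (N_BITS // 2)         # b = 2^(n/2)   (assumed example choice)
p = make_qpp(a, b, N_BITS)
print(is_involution(p, N_BITS), fixed_points(p, N_BITS))  # True [0, 128]
```

For this choice the fixed-point equation reduces to $2x(8x - 1) \equiv 0 \pmod{2^8}$; since $8x - 1$ is odd, only $x = 0$ and $x = 2^{n-1} = 128$ remain.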

T-Function Principles
In [54], Klimov and Shamir introduced a new class of low-complexity functions, the so-called triangular functions (T-functions), which are invertible and exhibit special cryptographic properties. For $x \in \mathbb{Z}_{2^n}$, a binary representation of x as an n-bit vector is given as follows: $x = ([x]_{n-1}, \dots, [x]_1, [x]_0)$, where $[x]_i \in \mathbb{Z}_2$ for every $0 \le i < n$.

Definition 5.2: A function $f(x)$ from an n-bit input to an n-bit output with the property that the i-th bit of its output depends only on the first, the second, ..., and the i-th bit of its input is called a T-function (short for triangular function). Eight basic operations for constructing T-functions were introduced as [52]:
$-a,\ \neg a,\ a + b,\ a - b,\ a \cdot b,\ a \oplus b,\ a \wedge b,\ a \vee b$ (all modulo $2^n$),
where a and b are two n-bit words.
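The triangular property can be tested directly: a function is a T-function exactly when its k low output bits depend only on the k low input bits, for every k. The sketch below is a brute-force check for 8-bit words with hypothetical helper names. As a positive example it uses the well-known Klimov-Shamir mapping $x + (x^2 \vee 5) \bmod 2^n$, which is not one of the classes proposed in this paper, and as a negative example a right shift.

```python
def is_t_function(func, n_bits):
    """Check that the low k output bits depend only on the low k input bits."""
    size = 1 << n_bits
    for k in range(1, n_bits + 1):
        low_mask = (1 << k) - 1
        for x in range(size):
            if (func(x) & low_mask) != (func(x & low_mask) & low_mask):
                return False
    return True


N_BITS = 8
MASK = (1 << N_BITS) - 1


def klimov_shamir(x):
    """x + (x^2 OR 5) mod 2^n, built only from T-function primitives."""
    return (x + ((x * x) | 5)) & MASK


def right_shift(x):
    """x >> 1 moves information from higher to lower bits, so it is not a T-function."""
    return x >> 1


print(is_t_function(klimov_shamir, N_BITS), is_t_function(right_shift, N_BITS))  # True False
```

Addition, multiplication, and OR only propagate information from lower to higher bit positions (through carries), which is why any composition of them passes the check.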
The following lemma gives an abstract bit-slicing representation of the arithmetic and logic operations, where the α's denote parameters [56].
In [54], Klimov [...] which holds for any bit i, where α is a parameter [56]. Finding the inverse of a T-function is not straightforward, and no general form is known.
The main target of the subsequent sections is to construct T-functions with SIQPF properties deploying the operations (+), (-), and ⊕ . Therefore, it is necessary to determine the required cryptographically relevant conditions on

New Classes of Self-Inverse Permutations
In the following, it is shown how to construct QPPs deploying +, -, and ⊕ operations to come up with SIQPFs.
[...] are said to be Quadratic Permutation Functions (QPFs) over $\mathbb{Z}_{2^n}$ if they permute all the elements of $\mathbb{Z}_{2^n}$.

Definition 6.2:
A self-inverse function is a function f over 2 n  , such that, where x is an n-bit word. Choosing the coefficients a, b according to Theorem 5.2, a, b can be represented as n-bit vectors as follows: where 2 n r ≥ if n is even and if n is odd. The proof is very simple. It is based on the definitions of the multiplication and the modulus over 2 n  .
Based on the previous lemma, the following theorem is proved: Theorem 6.3: Let n >2 be an integer. The permutation function

For n odd,
Proof: First, it is required to prove that: If $n$ is even, then $P(x) = ax + bx^2$ is a SIQPF over $\mathbb{Z}_{2^n}$, where And the third term: $g(g(x)) \bmod 2^n = x$.
That implies $P(P(x)) \bmod 2^n = x$ for any integer $n$, Q.E.D.
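The self-inverse property $P(P(x)) \bmod 2^n = x$ can be confirmed by brute force for small $n$. The sketch below is illustrative only: the helper names `quad` and `is_involution` are hypothetical, the polynomial $3x + 2x^2$ over $\mathbb{Z}_{2^3}$ is the example used in this paper, and the second coefficient pair ($a = 2^n - 1$, $b = 2^{n/2}$) is an assumption chosen because it is easy to verify by hand, not necessarily the coefficient set of Theorem 6.3.

```python
def quad(a, b, n):
    """Return P(x) = (a*x + b*x^2) mod 2^n as a Python callable."""
    mask = (1 << n) - 1
    return lambda x: (a * x + b * x * x) & mask


def is_involution(f, n):
    """True if f(f(x)) == x for every n-bit word x (exhaustive check)."""
    return all(f(f(x)) == x for x in range(1 << n))


if __name__ == "__main__":
    # Example polynomial from this paper, over Z_{2^3}.
    print(is_involution(quad(3, 2, 3), 3))

    # Assumed coefficient choice: a = -1 mod 2^n and b with b*b = 0 mod 2^n.
    n = 8
    print(is_involution(quad((1 << n) - 1, 1 << (n // 2), n), n))

    # A quadratic that is not self-inverse, for contrast.
    print(is_involution(quad(3, 1, 3), 3))
```

For the assumed pair, expanding $P(P(x)) = a^2 x + ab(1+a)x^2 + 2ab^2 x^3 + b^3 x^4$ shows why the check succeeds: $a^2 \equiv 1$, $1 + a \equiv 0$ and $b^2 \equiv 0 \pmod{2^n}$.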
Unfortunately, the fixed points stated above are the same constants for every SIQPF. A remedy for that weakness is proposed in this work to remove the specific constant fixed points in Theorem 6.3. This is attained, similarly to [55], by deploying the Boolean operator (OR), as proved in the following theorem.

D is an integer number. Moreover, the resulting two fixed points of any P(x) are distinct and different for each individual SIQPF.
Proof: Let us prove that where, First, we check the case of $0 < i < n/2$: where, where, Technically, from Equations (12), (13), (14) and Definition 5.3, the function $f(x) = ax + b \cdot g(x)$ can be represented as, which holds for any bit $i$ and $\alpha$ is a parameter. Therefore, It is very simple to show that which implies, The last step is only correct if and only if $D_1 = D_2$, which proves that the resulting fixed points of any $P(x)$ are distinct and different for each individual SIQPF, Q.E.D.

Practical significance of the SIQPFs of Theorem 6.5:
The fact that the construction in Theorem 6.5 yields two individual fixed points that differ from one SIQPF to another is advantageous for cryptographic applications. The reason is that ciphering operations usually involve cascading many different SIQPFs as round functions (see the cipher structure in Section 9 and Figure 8). Therefore, the dynamic distribution of the different fixed points for different SIQPFs in different cascading stages generally results in an improved random diffusion property of the overall cipher permutation.
Extending the $P(x)$ class of Equation (9)

D is an integer number. Moreover, the resulting two fixed points of any P(x) are distinct and different for each individual SIQPF.
Proof: The proof is similar to that of Theorem 6.5.

Cardinality of Proposed SIQPF Classes
In this section, the cardinalities of the SIQPF classes are evaluated. Moreover, the equivalent and distinct mappings of the permutation polynomials are identified.
In $\mathbb{Z}_{2^n}$, not all permutations can be generated by polynomials, and a given permutation may be generated by different polynomials, which are called equivalent polynomials modulo $2^n$. Therefore, computing the number of distinct polynomial permutations over $\mathbb{Z}_{2^n}$ requires excluding equivalent cases. Let $P_n$ be the set of all permutation polynomials that result in distinct permutations over $\mathbb{Z}_{2^n}$. Keller et al. [57] presented a formula to determine the cardinality of $P_n$. The cardinality of the set of all polynomial functions over different rings with some special conditions is presented in [58]. The formula which determines the cardinality of $P_n$ is given in [57] and [59] as follows: Using the fact that the sum of two even or two odd integers is always even, and according to Theorem 5.1, implies: In other words, the resulting polynomials generate the same permutation. Note that, according to the above lemma, the number of permutation polynomials $N_0$ may include some equivalent permutation polynomials modulo $2^n$. The following definition appears to be useful for the targeted evaluation.

Definition 7.2:
The cardinality of the set of all equivalent permutation polynomials modulo $2^n$ with degree $d \le 2^n - 1$ is equal to the number of all possible permutation polynomials $N_0$ having degree $d \le 2^n - 1$, excluding all distinct permutation polynomials $|P_n|$. That is:
Table 1 shows the number of equivalent permutation polynomials modulo $2^n$ of degree at most $2^n - 1$ for a few selected small values of $n$. Even a small value such as $n = 8$ results in a huge number of equivalent permutation polynomials. Therefore, it seems useful to seek an upper bound for the degree $d$ of all distinct permutation polynomials. The following upper bound on $d$ can be derived by making use of (16) for $N_0 = |P_n|$, resulting in: where the degree $d$ is the upper bound of the degree of distinct PPs of size $n$. The values of $n$ and $|P_n|$ are known, and $\lceil \cdot \rceil$ is the ceiling function. The formula in Equation (19) represents a necessary design rule for selecting such distinct permutation polynomials. Table 2 shows the relation between the cardinality of PPs and the corresponding highest degree $d$ of non-equivalent PPs over $\mathbb{Z}_{2^n}$. As all practical applications require $n > 3$ for $\mathbb{Z}_{2^n}$, and all proposed new SIQPF classes have degree 2, all resulting PPs in any class are distinct.
In the following, the cardinality of each distinct class of the new SIQPFs is computed: Proof: For $n$ even, and from Theorem 6.3, the following is true: This implies, For $n$ odd and applying similar steps, that implies:
Table 3 shows the corresponding cardinalities for all new SIQPF classes, where the procedure of Corollary 7.2 is repeatedly applied for each class. For a practical example of 64-bit arithmetic where $n = 64$, the cardinalities of the permutation classes are $|C_1| = 2^{33}$, $|C_2| = 2^{65}$, $|C_3| = 2^{97}$ and $|C_4| = 2^{129}$.
Table 3. Cardinality of all resulting classes of Self-Inverting Permutation Functions (SIPFs).
Notice that the given cardinalities in Table 3 represent worst-case bounds. The exact cardinalities seem to be difficult to evaluate, as equal mappings may occur in different mapping constellations. Therefore, the smallest cardinality values are used when evaluating the resulting cipher performance and cardinalities.

Hardware and Complexity Evaluation of SUC Rounds
To implement SIQPFs in the targeted FPGA platform, fabric Look-Up Tables (LUTs), D-Flip-Flops (DFFs), and Mathblocks are required. An optimal and effective implementation strategy aims to use the same number of LUTs and DFFs, i.e., the ratio $R_{LUT/DFF} = \#LUT / \#DFF$ should be close to 1. This can be inferred from the fact that when an LUT is consumed, its corresponding DFF in the same logic cell cannot be used elsewhere, as its input is used for the LUT. Most FPGA architectures provide an easy-to-connect DFF within each LUT.
The implementation complexity of such classes in SmartFusion®2 SoC FPGA is one of the major objectives of this research. Therefore, a sample complexity evaluation should show the relative efficiency of the designed permutations.
These classes of SIQPFs could be implemented in hardware, in software, or in a combined HW/SW implementation scenario for the targeted SUCs described in [29]. In this case, the SIQPF could constitute one efficient class required for constructing the SUC cascade. The cryptographic strength (attack complexity) of the generated permutations is attained through the huge number of possibilities in each SIQPF class, simply controlled by the permutation function coefficients. Figure 7 shows a basic hardware configuration for building the function $a + bx + cx^2$ for $n = 18$ bits by using 2 Mathblocks. The designed SIQPFs are modeled in VHDL and synthesized to check their hardware complexity and performance. The ModelSim ME package is used for simulation and Synplify Pro ME within Libero SoC is used for synthesis. Multipliers (MACCs), LUTs and DFFs constitute the basic resources for implementing such functions. By analyzing the FPGA resource usage for each function, a closed formula was found for the number of MACCs ($N_{MACC}$), LUTs ($N_{4LUT}$) and DFFs ($N_{DFF}$) for each class of SIQPFs and input data size in bits. Table 4 shows the required hardware resources (hardware complexity) as a function of the number of bits $n$ for $1 \le n \le 32$ bits for the permutation classes $C_1$ and $C_2$. The complexities of the permutation classes $C_3$ and $C_4$ are slightly higher and are not included in this evaluation. Here $U$ is a unit step function defined as

$R_{LUT/DFF}$, $n = 32$
$a + bx + cx^2$ exhibits the highest efficiency as it makes maximum use of the same number of deployed MACCs.
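One way to see why two multiplier blocks suffice for $a + bx + cx^2$ is to write the polynomial in Horner form, $a + x(b + cx)$, so that each step is a single multiply-accumulate. The following is a software model only, assuming that one Mathblock performs one multiply-accumulate and that intermediate results are truncated to $n$ bits; the function names are hypothetical and the actual wiring of Figure 7 is not reproduced here.

```python
def quad_horner(a, b, c, x, n):
    """Evaluate (a + b*x + c*x^2) mod 2^n with two multiply-accumulate steps."""
    mask = (1 << n) - 1
    inner = (c * x + b) & mask       # first multiply-accumulate:  c*x + b
    return (inner * x + a) & mask    # second multiply-accumulate: inner*x + a


def quad_direct(a, b, c, x, n):
    """Reference evaluation without intermediate truncation."""
    return (a + b * x + c * x * x) % (1 << n)


if __name__ == "__main__":
    n = 18                            # word size used for Figure 7
    a, b, c = 0x2A5A5, 0x15555, 0x00200   # arbitrary illustrative coefficients
    for x in (0, 1, 12345, (1 << n) - 1):
        assert quad_horner(a, b, c, x, n) == quad_direct(a, b, c, x, n)
    print("Horner form matches direct evaluation")
```

Truncating the intermediate value is safe because reduction modulo $2^n$ commutes with addition and multiplication.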

Software Complexity
The SmartFusion®2 SoC FPGA incorporates an ARM Cortex-M3 processor that supports the Thumb-2 instruction set, which includes enhanced instructions such as single-cycle multiplication of two 32-bit numbers. Table 6 shows the time and memory implementation complexities of the same set of permutation functions in the ARM Cortex software environment for some chosen numbers of bits $n$.

Proposed New SUC Constructions Based on Self-Inverse Permutations
To generate larger classes of ciphers, cascading one or more SIQPFs is necessary. This is also the first step toward creating SUCs with cryptographically significant entropy.

Possible Creation of an SUC as a Key-Alternating Cascade of SIQPFs
In [56], a permutation T-function $f$ is used to construct unusual permutations by XORing $f(x)$ with $x$, as $f(x) \oplus x$. Furthermore, the resulting SIQPF classes can be extended by using this result: for example, with $P(x) = a + bx \oplus cx^2$ as a SIQPF, the XOR of $P(x)$ and $x$, $P(x) \oplus x$, is a permutation but not necessarily a SIQPF. One of the most important results based on this discussion is their deployment in counteracting the weakness stated in Theorems 6.3, 6.4, 6.5 and 6.6. For example, let $P(x) = 3x + 2x^2$ be a SIQPF in $\mathbb{Z}_{2^3}$ and let $G(x) = x \oplus 3$ be an XOR mapping, where the bitwise XOR operation with any given value represents an involution. Then the resulting function composition $G(P(x)) = P(x) \oplus 3$ is a SIQPF without fixed points. Hence, the previous discussion indicates that XORing a SIQPF with a key $k$ results in a round function $P(x) \oplus k$ for a block cipher which avoids fixed points, where $k$ is a round key.
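The small example with $P(x) = 3x + 2x^2$ over $\mathbb{Z}_{2^3}$ and the XOR constant 3 can be reproduced by enumerating all eight inputs. The sketch below is only a check of that single example; the helper names are hypothetical, and it makes no claim about other keys or word sizes.

```python
N = 3                       # word size of the example
MASK = (1 << N) - 1


def p(x):
    """P(x) = 3x + 2x^2 mod 2^3, the SIQPF used in the example."""
    return (3 * x + 2 * x * x) & MASK


def round_fn(x, k=3):
    """Composition G(P(x)) = P(x) XOR k with the example constant k = 3."""
    return p(x) ^ k


def fixed_points(f):
    """All inputs x in Z_{2^3} with f(x) == x."""
    return [x for x in range(1 << N) if f(x) == x]


if __name__ == "__main__":
    print("mapping table:", [round_fn(x) for x in range(1 << N)])
    print("self-inverse :", all(round_fn(round_fn(x)) == x for x in range(1 << N)))
    print("fixed points :", fixed_points(round_fn))
```

Replacing `k` by other 3-bit values shows how the fixed-point set and the self-inverse property depend on the chosen constant.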

Cardinality of the GENIE-Selectable SUCs
This section investigates the cardinality of the resulting cipher when using the key-alternating cipher structure of Figure 8. The cipher has $r$ randomly selected self-inverse permutations $P_1, \ldots, P_r$ and $(r+1)$ randomly selected $n$-bit keys.
The usable self-inverse permutations are included in the above 4 different classes $C_1$, $C_2$, $C_3$, and $C_4$, from which $P_1, \ldots, P_r$ can be randomly selected.

[Figure 8. Permutation classes; SUC as an unknown key-alternating cipher.]
Notice that, for highest security and to avoid fixed points in the total mapping, at least one permutation needs to be selected from the classes $C_3$ and/or $C_4$. We select a few possible random cipher selection strategies to evaluate the cardinality of all possible selectable SUCs.
A. Fully selecting from the set containing all 4 classes
In this case, we consider the selectable mappings in Figure 8 to be drawn from the set $S$ of all 4 classes. As each cipher utilizes $(r+1)$ keys of size $n$, the cardinality of keys is $2^{(r+1)n}$. The GENIE selects $r$ mappings randomly from $S$; hence there are $r!$ possible placements of the mappings to build each cipher.
The GENIE randomly selects $r$ mappings from the classes of mappings $S$ and $(r+1) \times n$ key bits.
The placement of the selected mappings is totally random. We investigate the following two selection cases.
In the second case, we consider, for lower hardware complexity, that the GENIE designs the SUC by deploying a fixed number of selections for $P_i$ from each class $C_i$; the realization of such SUCs follows the SUC-Design-Template mechanism proposed in [30]. Since there exist 4 classes, we consider that the cipher structure has a fixed template of $P_i$ selections as follows: $t_i$ mappings ($P_i$'s) from $C_i$ for $i = 1, \ldots, 4$. Note that $t_1 + t_2 + t_3 + t_4 = r$. If the selection is done without repetition, then the cardinality $\sigma_{22}$ of this cipher (Figure 8) is:
The main advantage of deploying involutive (self-inverse) cascaded permutations (as SIQPFs) is that the same mapping modules can be used for both encryption and decryption operations by just reversing the sequence of the mappings with their round keys. In general, a larger $\sigma_{ij}$ yields more cloning security. Notice that a larger $\sigma_{ij}$ mostly requires more hardware complexity.
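The reuse of the same modules for encryption and decryption can be illustrated with a small software model of a key-alternating cascade. This is a sketch under stated assumptions: the round permutations are taken from the hypothetical family $P(x) = (-x + bx^2) \bmod 2^n$ with $b^2 \equiv 0 \pmod{2^n}$, chosen only because its self-inverse property is easy to verify, and the function names, coefficients and keys are illustrative rather than the GENIE's actual selections.

```python
def make_involution(b, n):
    """Hypothetical round permutation P(x) = (-x + b*x^2) mod 2^n.

    It is self-inverse whenever b*b == 0 mod 2^n (assumed here, checked below).
    """
    mask = (1 << n) - 1
    if (b * b) & mask:
        raise ValueError("b*b must vanish modulo 2^n for this family")
    return lambda x: (-x + b * x * x) & mask


def encrypt(x, perms, keys):
    """Key-alternating cascade: k0, then P_i followed by k_i for i = 1..r."""
    y = x ^ keys[0]
    for perm, key in zip(perms, keys[1:]):
        y = perm(y) ^ key
    return y


def decrypt(y, perms, keys):
    """Same routine with the mappings and round keys in reversed order."""
    return encrypt(y, perms[::-1], keys[::-1])


if __name__ == "__main__":
    n = 16
    perms = [make_involution(b, n) for b in (1 << 8, 3 << 8, 5 << 9)]  # r = 3
    keys = [0x1234, 0xBEEF, 0x0F0F, 0xA5A5]                            # r + 1 keys
    x = 0xCAFE
    assert decrypt(encrypt(x, perms, keys), perms, keys) == x
    print("round trip ok")
```

Because every round permutation and every key XOR is its own inverse, no separate inverse modules are needed; only the order changes.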

Security Evaluation of the Resulting SUCs
The security level (or bound) of any cipher is conventionally determined by applying Kerckhoffs's principle. That is, the attacker knows all details of the used cipher structure except the cipher key. Since in the SUC concept even the cipher itself is not known to anybody, the attack complexity is basically expected to increase [61]. However, the cipher structure may be predicted if the GENIE is published. Otherwise, if the GENIE is not published (this is allowed in the proposed realization concept), the attack complexity is expected to increase.
A well-known interpolation attack would successfully reveal a mapping $y$ equivalent to the whole SIQPF cascade if and only if the SUC is given as a cascade of only algebraic classes of SIQPFs such as $P(x) = a \pm bx \pm cx^2$. In this case, an adversary can compute such an equivalent mapping $y$ for all $r$ rounds with just $2r+1$ known plaintext/ciphertext pairs [62] and [63] as follows: Note that an adversary needs only to know that the functions are quadratic and to guess the number of rounds $r$.
In another case, the interpolation attack is not applicable when the selection of SIQPFs is drawn from non-algebraic classes such as However, Klimov et al. [55] presented an attack scenario on a T-function when it is used as a substitute for an LFSR in a stream cipher. This attack works, for example, only if the size of C is small such as 3 n , and then it requires There is no such attack on a block cipher that uses a T-function as a round function.

Modeling Attack on the Proposed SUC
When looking at modeling attacks on SUCs, there are two possibilities. Firstly, the target of an adversary using machine learning (ML) is to create a predictive model of an SUC by analyzing some training data. Theoretically, if an SUC is a weak Pseudorandom Function (PRF), then certain patterns of plaintext/ciphertext pairs could be easily identified and detected by an ML algorithm with little training. But when an SUC is a secure PRF, the successful detection of patterns becomes impossible. Moreover, if a designed SUC is a secure PRF, then there is no ML algorithm that can build a predictive model for such an SUC, because the secure PRF concept postulates that the output of the PRF is statistically independent of the training data and uncorrelated with any learner [64].
The second possible modeling attack is to store all the possible plaintext/ciphertext pairs, where the Cipher Codebook Size is $CCBS = 2^n$. However, storing $2^n$ entries to build a model for an SUC is infeasible for ciphers with $n > 80$.
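The growth of the codebook can be made concrete with a short calculation. The sketch assumes that the codebook is a table indexed by plaintext in which each of the $2^n$ entries stores one $n$-bit ciphertext; the function name `codebook_bits` is hypothetical.

```python
def codebook_bits(n: int) -> int:
    """Bits needed to tabulate all 2^n ciphertexts of n bits each (assumed layout)."""
    return n * (1 << n)


if __name__ == "__main__":
    for n in (16, 32, 64, 80):
        size_bytes = codebook_bits(n) / 8
        print(f"n = {n:2d}: 2^{n} entries, about {size_bytes:.3e} bytes")
```

The table size grows exponentially in $n$, which is why a full codebook stops being a realistic model well before typical block sizes are reached.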
In this section, the focus is put on the adversary who tries to use the collected SUC-input/output pairs in distinguishing attacks. Here, successful distinguishing attacks on SUCs indicate that the designed SUC structure is vulnerable. Therefore, sooner or later the adversary can build a predictive model for the designed SUC. If not, modeling attacks on SUCs are almost infeasible.
As a result of the above notes, definitions, and discussion, the self-generated SUC inside a chip can be modeled as a secure Pseudorandom Permutation (PRP) chosen randomly from the class of all possible generated ciphers, where $n$ and $k$ are the input-output size and the key size, respectively. The inverse of the SUC should be a secure PRP as well. In [68], the tight security bound $2^{\frac{rn}{r+1}}$ of the distinguishing attack on a key-alternating cipher is proved, i.e., no higher security bound can be attained.
It is therefore conjectured that the security bound $(2^n)^{\varepsilon}$ is enhanced by a factor equal to the product of the cardinalities of the deployed SIQPFs, where $\varepsilon$ is a function of $r$. Ongoing research is being conducted to answer this still open question. Nevertheless, the proposed SUC cipher design fulfills at least the state-of-the-art security requirements for standard good ciphers. In the following section, some statistical properties of SIQPFs are investigated. These statistical properties provide initial evidence for the indistinguishability of the proposed SUCs.

Statistical Properties of the Resulting SUC
In this section, the diffusion [46] and the frequency prediction [69] of SIQPFs are analyzed as statistical properties.

A. SUC Diffusion Properties
The essential measure of diffusion is the number of output bits that change when one input bit is changed. Ideally, the changing ratio of the output bits is 50%. However, a T-function is defined as a mapping in which bit $i$ of the output depends only on bits $0, 1, \ldots, i$ of the input [54]. Thus, changing the first input bit can affect all $n$ output bits, changing the second input bit can affect the last $n-1$ output bits, etc. Changing the last input bit affects only the last output bit. To test this property, the Hamming distance between the outputs of a randomly selected SIQPF is measured while changing one input bit at a time. Algorithm 1 is defined as a simulator to determine the amount of diffusion.
Return the average of A.
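A diffusion simulator of this kind can be sketched as follows. This is not the paper's Algorithm 1, whose listing is not reproduced here, but a minimal stand-in under stated assumptions: the mapping under test is any Python callable returning $n$-bit words, inputs are drawn with a seeded pseudo-random generator, and the result is the average fraction of output bits that change per single-bit input flip. The function name `average_diffusion` is hypothetical.

```python
import random


def average_diffusion(f, n, samples=1000, seed=0):
    """Average fraction of output bits changed when one input bit is flipped."""
    rng = random.Random(seed)
    changed = 0
    for _ in range(samples):
        x = rng.getrandbits(n)
        y = f(x)
        for i in range(n):
            # Hamming distance between f(x) and f(x with bit i flipped)
            changed += bin(y ^ f(x ^ (1 << i))).count("1")
    return changed / (samples * n * n)


if __name__ == "__main__":
    n = 8
    # The identity mapping changes exactly one output bit per flipped input bit.
    print(average_diffusion(lambda x: x, n))
```

Passing a full cascade (round permutations plus key additions) as `f` gives the figure that is compared against the ideal 50%.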
The results in Figure 9 show that increasing the number of rounds in Equation (21) does not change the statistical distribution of the diffusion of any resulting cipher in the class. In this case, the simulation indicates that the average diffusion is close to 50% when repeatedly using a single SIQPF with $r$ rounds (iterations). To test the frequency prediction, 10000 random input/output pairs are used for a selected SIQPF, where the prediction of each output bit $y_i$ is based on the statistical distribution of $P[y_i = 1]$. This procedure has been performed and repeated for 10 randomly selected SIQPFs. The results in Figure 10 show a high unpredictability of the output bit $y_i$, as $P[y_i = 1]$ is close to 50%.
Figure 10. The probability that any output bit is 1.
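The per-bit frequency test can be modeled in a few lines. The sketch below assumes uniformly random inputs from a seeded pseudo-random generator and estimates $P[y_i = 1]$ for every output bit; the function name `bit_one_frequencies` and the default sample count are illustrative, and the mapping under test is supplied by the caller.

```python
import random


def bit_one_frequencies(f, n, samples=10000, seed=0):
    """Estimate P[y_i = 1] for each output bit i of the n-bit mapping f."""
    rng = random.Random(seed)
    ones = [0] * n
    for _ in range(samples):
        y = f(rng.getrandbits(n))
        for i in range(n):
            ones[i] += (y >> i) & 1
    return [count / samples for count in ones]


if __name__ == "__main__":
    n = 8
    # Forcing bit 0 to one makes that bit perfectly predictable.
    freqs = bit_one_frequencies(lambda x: x | 1, n)
    for i, p_one in enumerate(freqs):
        print(f"bit {i}: P[y_i = 1] ~ {p_one:.3f}")
```

An output bit whose estimate deviates clearly from 0.5 would be predictable by simple frequency guessing.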

Conclusions
In this paper, new classes of low-complexity self-inverse permutations based on T-functions (SIPFs) are presented. The target usage of such functions is the creation of so-called Secret Unknown Ciphers (SUCs) at very low cost to serve as clone-resistant digital PUFs. The permutation algebra makes optimized use of multiplier/Mathblock units, which are often not consumed in many FPGA applications. As a result, high-quality ciphers may be embedded at negligible cost by using unconsumed resources in such SoC units. The resulting new cipher classes based on $\mathbb{Z}_{2^n}$ arithmetic are very promising in their security and quality. The resulting SUC structures are designed to be used as PUF alternatives and security anchors in emerging smart application scenarios. Creating SUCs is a very challenging task. This work is part of ongoing basic research towards creating and embedding SUCs in future VLSI technologies and showing their possible applications. Being created in a manufacturer-independent process and by end users, the SUC technology is expected to be attractive for a wide spectrum of applications in future automotive and IoT environments.

Conflicts of Interest:
The authors declare no conflict of interest.