A Novel Audio Encryption Algorithm with Permutation-Substitution Architecture

Abstract: In this paper, a new cryptographic method designed for the security of audio files is proposed. The encryption algorithm is based on classic symmetric models and uses a pseudo-random number generator composed of the chaotic circle map and modified rotation equations. The scheme of the new pseudo-random generator is presented and used as the basis for chaotic bit-level permutations and substitutions applied to the audio file structure. The audio encryption and decryption algorithms are described and explained. To demonstrate the level of security, we provide an extensive cryptographic analysis including key sensitivity analysis, key-space analysis, waveform and spectrogram analysis, correlation analysis, number of sample change rate analysis, level of noise analysis and a speed test.


Introduction
Cryptography, in general, is the art of secretly transferring information from a sender to a receiver or a group of receivers. Modern technologies have changed the way information is sent, because information is now digital, in the form of bits transferred over computer networks around the world. This requires the standard cryptographic algorithms used before the digital age to be adapted and/or modernized to work with digital information. In this paper, we present a new encryption method designed for audio file security, in order to store and transfer this specific type of file safely. The other important aspect of cryptology is cryptographic analysis, whose main purpose is to reveal encrypted messages. Cryptographic analysis often uses various empirical experiments to establish whether a proposed algorithm has the necessary level of security or to show that it is not secure enough. In this paper, we provide an extensive cryptographic analysis to confirm the level of security of the proposed audio encryption scheme.
Previous research in this area has shown that using chaotic maps to construct audio encryption algorithms leads to high levels of security. In [1], Liu, Kadir and Li proposed an encryption scheme using confusion and diffusion based on a multi-scroll chaotic system. Hato and Shihab evaluated the Lorenz and Rossler chaotic systems for speech signal encryption in [2]. Further valuable research is presented in [3], where Sathiyamurthi and Ramakrishnan used Bernoulli's chaotic map to construct an encryption algorithm. Tamimi and Abdalla provide a similar approach, an audio shuffle-encryption algorithm, in [4].
Important research [5] demonstrates that using a single chaotic map does not always guarantee the highest level of cryptographic security. That work reviews how encryption algorithms using Arnold's Cat Map, Baker's Map or the Two-Dimensional Logistic Chaotic Map may be vulnerable. This is one of the reasons we use a combination of two chaotic maps: it extends the key space (all the possible values of the secret key) for additional cryptographic security.
Considering the previous experience in this area, we constructed a novel audio encryption algorithm with permutation-substitution architecture and compared the results with other algorithms.

Pseudo-Random Generator Based on Circle Map and Modified Rotation Equations
Pseudo-random generators (PRGs) are software tools designed to provide an endless sequence of random bits. PRGs are often used as the main resource of symmetric cryptographic algorithms, which use the random bits to encrypt and decrypt digital files. Chaotic maps are widely preferred for constructing pseudo-random generators because of their chaotic behavior [6][7][8].

Circle Map
The circle map is a one-dimensional dynamical system often used in cryptography because of its chaotic behavior. Examples of PRGs based on the circle map are proposed in [9,10]. The iterations of the standard circle map are calculated by:

$$\theta_{n+1} = \theta_n + \Omega - \frac{K}{2\pi}\sin(2\pi\theta_n), \qquad (1)$$

where Ω is a fixed constant playing the role of the polar angle of the sinusoidal oscillator, and K is the coupling strength. The initial values we used for our scheme are $\theta_0 = -0.25$, Ω = 0.7128281828459045 and K = 0.5. The values of the constants were chosen based on the results of the experiments in our previous work on constructing PRGs based on the circle map [9].
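As a minimal sketch (not the authors' C++ implementation), one iteration of the circle map can be written in Python as follows. It assumes the standard form $\theta_{n+1} = \theta_n + \Omega - \frac{K}{2\pi}\sin(2\pi\theta_n)$ without wrap-around and uses the constants quoted above; the names `circle_map_step` and `circle_map_orbit` are hypothetical.

```python
import math

OMEGA = 0.7128281828459045  # Ω, as quoted in the text
K = 0.5                     # coupling strength K, as quoted in the text
THETA_0 = -0.25             # initial value θ_0, as quoted in the text


def circle_map_step(theta, omega=OMEGA, k=K):
    """One iteration of the standard circle map (assumed form, no wrap-around)."""
    return theta + omega - (k / (2.0 * math.pi)) * math.sin(2.0 * math.pi * theta)


def circle_map_orbit(theta0=THETA_0, n=5):
    """Return the iterates θ_1 ... θ_n obtained by starting from theta0."""
    orbit = []
    theta = theta0
    for _ in range(n):
        theta = circle_map_step(theta)
        orbit.append(theta)
    return orbit


if __name__ == "__main__":
    print(circle_map_orbit())
```

Because the sine term has period 1 in θ, reducing θ modulo 1 after each step would leave the fractional part of the orbit unchanged; the sketch omits that reduction for simplicity.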

Modified Rotation Equations
A modified form of the rotation equations, presented in [11,12], is rarely used in encryption algorithms, but in our previous work we established the good chaotic properties of the modified rotation equations [13]. The formula we used is Equation (2), where the parameters for chaotic behavior are θ = 2 and a = 2.8. The initial values we used are $x_0 = 0.2343214592$ and $y_0 = -0.742190593$. The plots of 5000 points generated with these initial values are shown in Figure 1.

Pseudo-Random Generation Algorithm
The chaotic formulas presented in Sections 2.1 and 2.2 are used for pseudo-random binary sequence generation by following these steps:
Step 1: The initial values $\theta_0$, Ω and K from Equation (1) are determined.
Step 2: The initial values θ, a, $x_0$ and $y_0$ from Equation (2) are determined.
Step 3: The two chaotic maps from Equations (1) and (2) are iterated M times without extracting any results. This step provides additional security and extends the secret key.
Step 4: The current iteration of Equation (1) is used to obtain θ, and its decimal value is post-processed into $s_i$ using integer(x), which returns the integer part of x by truncating the value at the decimal point, and mod(x, y), which returns the remainder after division.
Step 5: The current iteration of Equation (2) is used to obtain y, and its decimal value is post-processed into $j_i$ using the same integer(x) and mod(x, y) functions.
Step 6: The calculated values $s_i$ and $j_i$ are combined with an XOR operation to produce a single output bit.
Step 7: Return to Step 4 until a bit stream of the necessary length is produced.
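The control flow of Steps 1-7 can be sketched in Python as below. This is an illustration under stated assumptions, not the authors' generator: the circle map is taken in its standard form, the second map is a logistic-map placeholder standing in for Equation (2), and the post-processing `mod(integer(|v| * 10^9), 2)` is a hypothetical choice for turning a real value into one bit. All function names are hypothetical.

```python
import math
from itertools import islice


def circle_map_step(theta, omega=0.7128281828459045, k=0.5):
    """Standard circle map iteration (assumed form of Equation (1))."""
    return theta + omega - (k / (2.0 * math.pi)) * math.sin(2.0 * math.pi * theta)


def stand_in_map_step(y, r=3.99):
    """Logistic map: a placeholder for Equation (2), NOT the modified rotation equations."""
    return r * y * (1.0 - y)


def to_bit(value, scale=10**9):
    """Hypothetical post-processing: mod(integer(|value| * scale), 2)."""
    return int(abs(value) * scale) % 2


def prg_bits(theta0, y0, m):
    """Yield an endless bit stream following the structure of Steps 1-7."""
    theta, y = theta0, y0               # Steps 1-2: initial values
    for _ in range(m):                  # Step 3: discard the first M iterations
        theta, y = circle_map_step(theta), stand_in_map_step(y)
    while True:                         # Step 7: repeat while bits are requested
        theta, y = circle_map_step(theta), stand_in_map_step(y)
        s = to_bit(theta)               # Step 4: bit from the first map
        j = to_bit(y)                   # Step 5: bit from the second map
        yield s ^ j                     # Step 6: XOR into one output bit


# The y0 value here belongs to the placeholder map, not to Equation (2).
bits = list(islice(prg_bits(theta0=-0.25, y0=0.3, m=1000), 64))
```

Writing the generator as a lazy iterator keeps the "endless sequence" property: the caller pulls exactly as many bits as a sample needs, and the chaotic state advances in step with the consumption.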

Statistical Results
Using pseudo-random binary streams for cryptographic algorithms requires extended statistical analysis to ensure the randomness of the stream [14,15]. The PRG described in the previous section was tested by generating 1 billion bits, and the binary sequence was subjected to the following analyses:

NIST-Test
The NIST Statistical Test Suite [16] is one of the most widely used software packages where PRGs are concerned. The NIST package evaluates randomness by performing 17 tests over a given binary sequence. The evaluation is made by dividing the 1,000,000,000 bits into 1000 subsequences of 1,000,000 bits each. The results for all tests are presented in Table 1. The minimum pass rate for each statistical test, with the exception of the random excursion (variant) test, is approximately 980 for a sample size of 1000 binary sequences. The minimum pass rate for the random excursion (variant) test is approximately 603 for a sample size of 617 binary sequences. All the NIST tests have p-values in the acceptable range [0,1), indicating that all tests are passed.

DIEHARD-Test
The DIEHARD software contains 19 tests for randomness, and we applied them to the same bitstream of 1,000,000,000 bits produced by our PRG. The acceptable range of the calculated p-values for passing the individual tests is again [0, 1) [17]. The results for all tests are presented in Table 2. All the tests in Table 2 have p-values within the desired range, meaning that all the tests for randomness are passed.

ENT-Test
The last statistical test software we used to assess randomness is ENT [18]. It comprises six tests: entropy, optimum compression, $\chi^2$ distribution, arithmetic mean value, Monte Carlo π estimation, and serial correlation coefficient. The results are presented in Table 3.

Key Space Analysis
The key space is defined by the initial values from Equations (1) and (2) and represents all the possible values that can be used as a secret key for encryption. The parameters of the equations are not included in the key space because their values do not change, so the included variables are $\theta_0$ from Equation (1), $x_0$ and $y_0$ from Equation (2), and M from the proposed algorithm. According to the IEEE floating-point standard [19], the precision of 64-bit double variables is about $10^{-15}$. In our case we have three double variables, so the final key space is about $10^{45} \approx 2^{149}$. This is large enough to resist brute-force attacks [20].
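The conversion from the decimal estimate to a power of two is a short calculation. It is written out here as a worked restatement of the figures above, counting only the three double variables and ignoring any contribution from M:

```latex
\left(10^{15}\right)^{3} = 10^{45},
\qquad
\log_2 10^{45} = 45 \log_2 10 \approx 45 \times 3.3219 \approx 149.5,
\qquad
\text{so } 10^{45} \approx 2^{149}.
```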

Key Sensitivity Analysis
The key sensitivity test requires using very similar keys and comparing the resulting random sequences. To compare the bit streams produced by our PRG, we used five different but very similar keys. For the first key (K1) we used the initial values from Sections 2.1 and 2.2. For the second key (K2), $\theta_0$ is changed to 0.251; for K3, $\theta_0$ is 0.249; for K4, $x_0$ is 0.2343214593; and for K5, $y_0$ is −0.742190594.
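One simple way to quantify how different two such bit streams are is the fraction of positions at which they disagree; for unrelated random streams this fraction is expected to be close to 0.5. The sketch below is an assumption about how the comparison could be made, not the procedure used in the paper, and the two short bit lists are toy data rather than output of the proposed PRG.

```python
def bit_difference_rate(stream_a, stream_b):
    """Fraction of positions at which two equal-length bit sequences differ."""
    if len(stream_a) != len(stream_b):
        raise ValueError("streams must have the same length")
    if not stream_a:
        raise ValueError("streams must not be empty")
    differing = sum(a != b for a, b in zip(stream_a, stream_b))
    return differing / len(stream_a)


# Toy data only: stand-ins for the streams generated with keys K1 and K2.
k1_bits = [0, 1, 1, 0, 1, 0, 0, 1]
k2_bits = [1, 1, 0, 0, 1, 1, 0, 0]
rate = bit_difference_rate(k1_bits, k2_bits)
```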

Audio Encryption Algorithm with Permutation-Substitution Architecture
In this section, we present an audio encryption/decryption algorithm that uses the pseudo-random generator described in Section 2.3 and builds on previous research [21]. Considering the audio file structure, our algorithm leaves the header bits intact and processes only the audio data part of the file. The header bits are not modified because they contain information about the file size, number of samples, bits per sample, etc., which is needed for the further empirical experiments. The audio data part is divided into samples whose digital values represent the actual sound signal. Our scheme processes each sample in two stages: it shifts the bits in the sample (permutation) and then changes the values of the bits in the sample (substitution).

Encryption Algorithm
The audio encryption algorithm consists of the following steps:
Step 1: The initial values from Equations (1) and (2) are determined, then the PRG is iterated M times.
Step 2: The header bits from the plain audio file A are transferred into file A′ without cryptographic modifications.
Step 3: The audio data from file A is processed sample by sample.
Step 4: S bits are extracted from the proposed PRG, where S is the number of bits in the sample. The bits are converted into an integer value $S_1$.
Step 5: The integer P is calculated as $P = S_1 \bmod S$.
Step 6: The bits of the current sample are shifted by the value P obtained in the previous step.
Step 7: The bits of the resulting sample are modified using an XOR operation with the same number of bits produced by the proposed PRG.
Step 8: The encrypted sample from Step 7 is transferred into file A′.
Step 9: Steps 4-8 are repeated until the end of the plain file A is reached.
Step 10: The produced output file A′ is the final encrypted audio file.
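The per-sample part of the procedure (Steps 4-8) can be sketched in Python as follows. The sketch makes several assumptions that the text does not state: the "shift" is taken to be a circular left rotation (otherwise bits would be lost and exact decryption would be impossible), samples are treated as raw unsigned words of `width` bits, PRG bits are packed most significant bit first, and header handling is omitted. The bit source here is Python's `random` module, used only as a stand-in for the proposed PRG; all names are hypothetical.

```python
import random


def bits_to_int(bits):
    """Interpret a sequence of bits (most significant first) as an integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def rotate_left(value, shift, width):
    """Circular left shift of a width-bit unsigned value."""
    shift %= width
    mask = (1 << width) - 1
    return ((value << shift) | (value >> (width - shift))) & mask


def encrypt_samples(samples, bit_source, width=16):
    """Apply Steps 4-8 to each sample; bit_source is any iterator of 0/1 values."""
    encrypted = []
    for sample in samples:
        s1 = bits_to_int([next(bit_source) for _ in range(width)])    # Step 4
        p = s1 % width                                                # Step 5
        permuted = rotate_left(sample, p, width)                      # Step 6
        mask = bits_to_int([next(bit_source) for _ in range(width)])  # Step 7
        encrypted.append(permuted ^ mask)                             # Step 8
    return encrypted


def demo_bit_source(seed):
    """Stand-in for the proposed PRG: seeded bits from Python's random module."""
    rng = random.Random(seed)
    while True:
        yield rng.getrandbits(1)


samples = [0x1234, 0xFFFF, 0x0000, 0x8001]
cipher = encrypt_samples(samples, demo_bit_source(seed=2024))
```

Each sample consumes 2S bits of key stream: S bits to derive the rotation amount P and S further bits for the XOR mask.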

Decryption Algorithm
The decryption method consumes the PRG output bits in the same linear order as encryption, but it applies the bit-shifting and the bit-modification of every sample in the opposite order. The decryption steps are:
Step 1: The initial values from Equations (1) and (2) are determined, then the PRG is iterated M times.
Step 2: The header bits from the encrypted audio file A′ are transferred into file A″ without cryptographic modifications.
Step 3: The audio data from file A′ is processed sample by sample.
Step 4: S bits are extracted from the proposed PRG, where S is the number of bits in the sample. The bits are converted into an integer value $S_1$.
Step 5: The integer P is calculated as $P = S_1 \bmod S$.
Step 6: The bits of the current sample are modified using an XOR operation with the same number of bits produced by the proposed PRG.
Step 7: The bits of the resulting sample are shifted back by the value P obtained in Step 5.
Step 8: The resulting sample from Step 7 is transferred into file A″.
Step 9: Steps 4-8 are repeated until the end of the encrypted file A′ is reached.
Step 10: The produced output file A″ is the final decrypted audio file.
Figure 4 illustrates the decryption process.
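A matching sketch of the per-sample decryption (Steps 4-8) is given below, under the same assumptions as for encryption: the "shift" is a circular rotation (left when encrypting, so right when decrypting), samples are unsigned words of `width` bits, and PRG bits are packed most significant bit first. The function names are hypothetical, and the key stream in the usage line is a hand-written toy sequence rather than PRG output.

```python
def bits_to_int(bits):
    """Interpret a sequence of bits (most significant first) as an integer."""
    value = 0
    for b in bits:
        value = (value << 1) | b
    return value


def rotate_right(value, shift, width):
    """Circular right shift of a width-bit unsigned value."""
    shift %= width
    mask = (1 << width) - 1
    return ((value >> shift) | (value << (width - shift))) & mask


def decrypt_samples(samples, bit_source, width=16):
    """Apply decryption Steps 4-8; bit_source must replay the encryption key stream."""
    decrypted = []
    for sample in samples:
        s1 = bits_to_int([next(bit_source) for _ in range(width)])    # Step 4
        p = s1 % width                                                # Step 5
        mask = bits_to_int([next(bit_source) for _ in range(width)])
        unmasked = sample ^ mask                                      # Step 6: undo substitution
        decrypted.append(rotate_right(unmasked, p, width))            # Step 7: undo permutation
    return decrypted


# Toy 4-bit example: key stream 0110 gives P = 6 mod 4 = 2, mask bits are 1010.
recovered = decrypt_samples([12], iter([0, 1, 1, 0, 1, 0, 1, 0]), width=4)  # -> [9]
```

The key stream is read in the same order as during encryption (rotation bits first, mask bits second); only the order in which the two operations are applied to the sample is reversed.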

[Figure 4. Decryption process. Diagram blocks: encrypted audio file, header information, sample N, PRG, substitution, permutation, decrypted file.]

The encryption and decryption methods are implemented in the C++ programming language for further evaluation through the extended cryptographic analysis presented in the next section.

Cryptographic Analysis
The main purpose of cryptographic analysis is to restore the plain message from the encrypted message. In this section, in order to demonstrate the efficiency of the audio encryption, we perform various empirical tests that compare plain files with their corresponding encrypted files.

Waveform Plotting
One of the most common approaches to audio signal analysis is waveform plotting, which displays the audio signal amplitude over time. To compare the plain audio files with the encrypted ones, we present the visualization of one of the tested files. Figure 5a shows the waveform of the original file before encryption, Figure 5b shows the changes in the file after encryption, and Figure 5c shows the restored file after decryption.
The difference between the plain file plot and the encrypted file plot is an indication of successful encryption. Furthermore, the strong difference suggests that the original file cannot be restored, even partially, from the encrypted waveform.

Spectrogram Plotting
Spectrogram plotting is another important approach to analyzing audio signals. In this case the main focus is the frequency content of the sound over time. Comparing plain files with encrypted files allows us to see the difference between them and to evaluate the proposed audio encryption algorithm. Figure 6a shows the spectrogram of the plain file, Figure 6b shows the changes in the file after encryption, and Figure 6c shows the restored file after decryption. The spectrogram of the encrypted file shows that the frequency structure of the original signal in the plain file is completely destroyed. This test is another indicator of the strong encryption properties of the proposed audio encryption algorithm.

Histogram Analysis
Histograms are a common tool for measuring the distribution of values. Evaluating an audio signal with histograms is a good way to determine the distribution of the sample values in the audio file. Figure 7a shows the histogram of one of the tested plain files and Figure 7b shows the histogram of the corresponding encrypted file. The values in Figure 7b are distributed nearly uniformly, indicating strong encryption. The near-uniform distribution also indicates resistance against statistical attacks.

Correlation Analysis
The correlation coefficient between two audio files expresses the dependency between their corresponding sample values. This is another statistical measure for testing the quality of encryption algorithms. The correlation coefficient always lies in the range [−1, 1]. An absolute value between 0.7 and 1 is considered strong correlation (samples from the plain file are similar to samples from the encrypted file), an absolute value between 0.3 and 0.7 is considered medium correlation, and an absolute value between 0 and 0.3 is considered weak correlation.
The correlation coefficient can be calculated as follows:

$$r_{xy} = \frac{\mathrm{cov}(x, y)}{\sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})^2}\,\sqrt{\frac{1}{N}\sum_{i=1}^{N}(y_i - \bar{y})^2}}, \qquad \mathrm{cov}(x, y) = \frac{1}{N}\sum_{i=1}^{N}(x_i - \bar{x})(y_i - \bar{y}),$$

where N is the total number of samples, $x_i$ and $y_i$ are the sample values of the plain and encrypted files, $\bar{x}$ and $\bar{y}$ are the mean values of the samples, and cov(x, y) is the covariance between both files. Table 4 shows the results obtained from our tests. The values in Table 4 are close to zero, which means there is no linear dependence between the two files and indicates high-quality encryption.
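As an illustration, the correlation coefficient between two equal-length sample sequences can be computed with the short Python function below. It assumes the usual Pearson definition (covariance divided by the product of the standard deviations, all normalized by N); the function name is hypothetical and the input values are toy data, not samples from the tested files.

```python
import math


def correlation_coefficient(x, y):
    """Pearson correlation between two equal-length sample sequences."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("sequences must be non-empty and of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Toy data: a perfectly linear relation gives a coefficient of 1.
r = correlation_coefficient([1, 2, 3, 4], [2, 4, 6, 8])
```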

Number of Sample Change Rate
The number of sample change rate (NSCR) is a robustness test for establishing the quality of encryption algorithms. The purpose of the test is to compare the corresponding sample values of the original and encrypted audio files and to express the difference as a percentage. NSCR can be calculated as follows:

$$\mathrm{NSCR} = \frac{\sum_{i=1}^{N} D_i}{N} \times 100\%, \qquad (4)$$

where

$$D_i = \begin{cases} 0, & x_i = y_i, \\ 1, & x_i \neq y_i. \end{cases}$$

In Equation (4), N is the total number of samples, and $x_i$ and $y_i$ are the corresponding sample values of the plain and encrypted files. Table 5 shows the results obtained from our tests. The results demonstrate the complete difference between the original and encrypted files, indicating a high security level of the proposed audio encryption scheme.
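A minimal Python sketch of the NSCR computation, assuming it is the percentage of positions at which the plain and encrypted samples differ, is shown below. The function name is hypothetical and the inputs are toy values.

```python
def nscr(plain, encrypted):
    """Percentage of positions whose sample values differ between the two files."""
    n = len(plain)
    if n == 0 or n != len(encrypted):
        raise ValueError("sequences must be non-empty and of equal length")
    changed = sum(1 for x, y in zip(plain, encrypted) if x != y)
    return 100.0 * changed / n


# Toy data: three of the four samples differ.
rate = nscr([1, 2, 3, 4], [1, 9, 9, 9])
```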

Signal to Noise Ratio
Signal to Noise Ratio (SNR) is widely used to determine the quality of signals [2,23]. Values greater than 0 dB indicate that the power of the clear signal exceeds that of the noise. For this test we need both the plain and the encrypted audio files, and SNR is calculated as follows:

$$\mathrm{SNR} = 10 \log_{10} \frac{\sum_{i=1}^{N} x_i^2}{\sum_{i=1}^{N} (x_i - y_i)^2},$$

where $x_i$ and $y_i$ are the corresponding sample values from the plain and encrypted audio files, and N is the number of samples.
The results of our SNR tests are shown in Table 6. All the obtained SNR values are negative, which means the encrypted files are very noisy and the encryption method completely destroys the clear signal of the plain audio files.
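The SNR between a plain and an encrypted sample sequence can be sketched in Python as follows. The sketch assumes the common definition in which the "noise" is the sample-wise difference between the two files; the function name is hypothetical and the inputs are toy values.

```python
import math


def snr_db(plain, encrypted):
    """SNR in dB: signal power of the plain file over power of the difference."""
    if len(plain) == 0 or len(plain) != len(encrypted):
        raise ValueError("sequences must be non-empty and of equal length")
    signal = sum(x * x for x in plain)
    noise = sum((x - y) ** 2 for x, y in zip(plain, encrypted))
    if noise == 0:
        return math.inf      # identical files: no noise at all
    if signal == 0:
        return -math.inf     # silent plain file: only noise is present
    return 10.0 * math.log10(signal / noise)


# Toy data: noise power is 100 times the signal power, i.e. about -20 dB.
value = snr_db([1, 1, 1, 1], [11, 11, 11, 11])
```

A negative result, as in the toy example, means the difference between the files carries more power than the original signal.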

Peak Signal to Noise Ratio
Peak Signal to Noise Ratio (PSNR) is a different approach to measuring the power of the clean signal against the power of the noise. PSNR is more commonly applied to image encryption algorithms, but it can also be used to test the quality of the encryption scheme proposed in this paper. PSNR is calculated as follows:

$$\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}^2}{\mathrm{MSE}},$$

where MAX is the maximum possible value of the audio stream (in our case, 65,535) and MSE is the mean square error between the plain and encrypted files. MSE is defined as:

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - y_i)^2,$$

where N is the total number of samples, and $x_i$ and $y_i$ are the corresponding sample values of the plain and encrypted files. Table 7 contains the results of our tests. All the obtained PSNR values are close to zero (or below), indicating a very high level of noise in the encrypted audio files.
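A short Python sketch of the PSNR computation is given below, assuming the standard definitions of PSNR and MSE and the maximum value of 65,535 mentioned above. The function names are hypothetical and the inputs are toy values.

```python
import math


def mean_square_error(plain, encrypted):
    """Mean of the squared sample-wise differences."""
    n = len(plain)
    if n == 0 or n != len(encrypted):
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((x - y) ** 2 for x, y in zip(plain, encrypted)) / n


def psnr_db(plain, encrypted, max_value=65535):
    """PSNR in dB for samples whose largest possible value is max_value."""
    mse = mean_square_error(plain, encrypted)
    if mse == 0:
        return math.inf      # identical files
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy data: every sample is off by the full range, so MSE = MAX^2 and PSNR = 0 dB.
value = psnr_db([0, 0], [65535, 65535])
```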

Encryption/Decryption Key Sensitivity
In Section 2.4.5 we performed a key sensitivity test of the PRG used for the audio encryption algorithm proposed in this paper. To analyze the general behavior of the encryption method, we used very similar keys to encrypt and to restore an encrypted audio file. The decryption key is obtained by changing a single digit of one of the variables that make up the key space. Key 1 and Key 2 are described in Section 2.4.5.
Figures 8 and 9 both demonstrate that decryption is unsuccessful even with a very similar secret key. Changing a single digit of the key leads to failed decryption. The experiment demonstrates the high key sensitivity of the proposed audio encryption algorithm.
[Figure 9. Spectrogram plotting, key sensitivity. Panel (c): spectrogram of the file decrypted using secret key K2. Axis label: Magnitude, dB.]

Speed Performance
To measure the encryption time, we used audio files of different sizes on the following hardware configuration: Dell Inspiron with a 2.40 GHz Intel® Core™ i7-3630QM processor, 8 GB RAM, Windows 7. Table 8 contains the results of our tests.

Conclusions
This paper evaluates a new design of an encryption algorithm for audio files. The proposed cryptographic algorithm relies on a permutation-substitution architecture realized using the chaotic circle map and modified rotation equations. An extended cryptographic analysis is performed to test the security of the proposed method. The waveform plots and the spectrograms of the tested audio files demonstrate the changes in the encrypted files compared to the plain files. The correlation analysis and NSCR tests confirm the high quality of the encryption, demonstrating that the sample values in corresponding files are completely different. The measured SNR and PSNR values show high levels of noise in the encrypted files, indicating that the original signal is destroyed in the encryption process. The key space analysis shows the necessary level of security against brute-force attacks, and the key sensitivity analysis shows that even a minimal change of the secret key leads to unsuccessful decryption. Considering the results obtained during the cryptographic analysis, we conclude that the proposed algorithm has the necessary cryptographic security for audio file encryption.
Funding: This research received no external funding.