A Selective Video Encryption Scheme Based on Coding Characteristics

: The protection of video data has become a hot topic of research. Researchers have proposed a series of coding algorithms to ensure the safe and e ﬃ cient transmission of video information. We propose an encryption scheme that can protect video information with higher security by combining the video coding algorithm with encryption algorithm. The H.264 / AVC encoding algorithm encodes the video into multiple slices, and the slices are independent of each other. With this feature, we encrypt each slice while using the cipher feedback (CFB) mode of the advanced encryption standard (AES) with the dynamic key. The key is generated by the pseudo-random number generator (PRNG) and updated in real time. The encryption scheme goes through three phases: constructing plaintext, encrypting plaintext, and replacing the original bitstream. In our scheme, we encrypt the code stream after encoding, so it does not a ﬀ ect the coding e ﬃ ciency. The purpose of the CFB mode while using the AES encryption algorithm is to maintain the exact same bit rate and produce a format compatible bitstream. This paper proposes a new four-dimensional (4-D) hyperchaotic algorithm to protect data privacy in order to further improve the security of video encryption. Symmetric encryption requires that the same key is used for encryption and decoding. In this paper, the symmetry method is used to protect the privacy of video data due to the large amount of video encrypted data. In the experiment, we evaluated the proposed algorithm while using di ﬀ erent reference video sequences containing motion, texture, and objects.


Introduction
Sharing video in multimedia applications has become an indispensable part of people's lives, especially in the medical industry [1] and military applications, due to the continuous development of multimedia applications and network technologies. With the continuous maturity and in-depth application of office automation in the medical industry, as the core information assets of medical care, these data assets are intentionally or unintentionally leaked, which will cause economic and reputation loss to the continuous operation of the enterprise, and even face more strict regulatory penalties. Therefore, some researchers have invested a lot of effort to further improve the video encryption scheme to ensure the security of private video data.
We all know that video encryption is generally divided into full encryption and selective encryption [2,3], both of which have advantages and disadvantages. Full encryption is usually suitable for small amounts of encrypted data and it requires high security. However, selective encryption is suitable for large amounts of encrypted data and strong real-time performance. Video encryption is a very time-consuming process due to the large amount of video data. Many video encryption algorithms are proposed according to the different semantic elements of encryption. Some schemes are focus on H.264/AVC video encryption systems. In the third section, we will introduce our proposed selective video encryption scheme. In the fourth section, the experimental results and experimental analysis of our proposed selective encryption scheme are given. Finally, we completed the study by giving conclusions.
This paper is organized, as follows: In Section 1, we briefly introduce some application backgrounds of video encryption and some common methods of video encryption. In Section 2, we will focus on H.264/AVC video encryption system. In Section 3, we will introduce our selective video encryption scheme. We present the experimental results and analyses in Section 4. In Section 5, we elaborate the relevant conclusions of the paper.

Description of Video Encryption Scheme Based on H.264/AVC
We all know that H.264/AVC [24] is a highly compressed digital video codec standard jointly proposed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Motion Picture Experts Group (MPEG). We introduce the relevant knowledge in this section. These related works are closely related to our proposed selective video encryption scheme that is based on encoding characteristics.

H.264/AVC Bit-Stream Syntax Structure
We will focus on analyzing the syntax structure of the H.264/AVC bitstream to propose a format compatible and secure video encryption scheme. By reading the paper, we know that H.264/AVC has the characteristics of bit-stream hierarchy [11], and the video stream is composed of individual syntax elements. Figure 1 shows the data hierarchy structure of H. 264/AVC bitstream.
In the H.264 structure, the encoded video is composed of group of picture (GOP), and the image group is composed of frame images. A frame image is composed of one or more pieces, one piece is composed of one or more macroblocks (MB) and one macroblock is composed of 16 × 16 YUV data. section, we will focus on H.264/AVC video encryption systems. In the third section, we will introduce our proposed selective video encryption scheme. In the fourth section, the experimental results and experimental analysis of our proposed selective encryption scheme are given. Finally, we completed the study by giving conclusions. This paper is organized, as follows: In Section 1, we briefly introduce some application backgrounds of video encryption and some common methods of video encryption. In Section 2, we will focus on H.264/AVC video encryption system. In Section 3, we will introduce our selective video encryption scheme. We present the experimental results and analyses in Section 4. In Section 5, we elaborate the relevant conclusions of the paper.

Description of Video Encryption Scheme Based on H.264/AVC
We all know that H.264/AVC [24] is a highly compressed digital video codec standard jointly proposed by the ITU-T Video Coding Experts Group (VCEG) and the ISO/IEC Motion Picture Experts Group (MPEG). We introduce the relevant knowledge in this section. These related works are closely related to our proposed selective video encryption scheme that is based on encoding characteristics.

H.264/AVC Bit-Stream Syntax Structure
We will focus on analyzing the syntax structure of the H.264/AVC bitstream to propose a format compatible and secure video encryption scheme. By reading the paper, we know that H.264/AVC has the characteristics of bit-stream hierarchy [11], and the video stream is composed of individual syntax elements. Figure 1 shows the data hierarchy structure of H. 264/AVC bitstream.
In the H.264 structure, the encoded video is composed of image group group of picture (GOP), and the image group is composed of frame images. A frame image is composed of one or more pieces, one piece is composed of one or more macroblocks (MB) and one macroblock is composed of 16 × 16 YUV data.  Figure 1 contains three different types of frames (I-frame, P-frame and B-frame). Among them, I frames are encoded independently of other images. P-frames are encoded using the predictions of previous I or P-frames. B frames use double prediction of the previous and next I or P frames. The GOP sequence takes I frame as a start frame, and it is a periodic sequence. Each frame is sliced to limit error transmission.

Pseudo Random Sequence Generator
For known encryption algorithms, the encryption algorithm is more secure if the key is more difficult to obtain. Most researchers believe that pseudo-random sequence generator is a good choice  Figure 1 contains three different types of frames (I-frame, P-frame and B-frame). Among them, I frames are encoded independently of other images. P-frames are encoded using the predictions of previous I or P-frames. B frames use double prediction of the previous and next I or P frames. The GOP sequence takes I frame as a start frame, and it is a periodic sequence. Each frame is sliced to limit error transmission.

Pseudo Random Sequence Generator
For known encryption algorithms, the encryption algorithm is more secure if the key is more difficult to obtain. Most researchers believe that pseudo-random sequence generator is a good choice Symmetry 2020, 12, 332 4 of 15 in order to achieve this goal. These sequences generated by a pseudo-random sequence generator can be used as a dynamic key for a cryptographic system, which confirms the security of the encryption algorithm. The chaotic function is very sensitive to the initial value and it is close to the noise characteristics, so using the chaotic function to make a pseudo-random sequence generator is a good choice. However, the key length of the chaotic function is short, and the chaotic value distribution is uneven. Therefore, Xu et al. [25] proposed an effective pseudo-random sequence generator. Figure 2 shows the working mode of PSNG.
in order to achieve this goal. These sequences generated by a pseudo-random sequence generator can be used as a dynamic key for a cryptographic system, which confirms the security of the encryption algorithm. The chaotic function is very sensitive to the initial value and it is close to the noise characteristics, so using the chaotic function to make a pseudo-random sequence generator is a good choice. However, the key length of the chaotic function is short, and the chaotic value distribution is uneven. Therefore, Xu et al. [25] proposed an effective pseudo-random sequence generator. Figure 2 shows the working mode of PSNG.
As shown in Figure 2, the Pseudo-random number generator (PRNG) is composed of a linear feedback shift register (LFSR), a bit recombination (BR), and nonlinear function F. Among them, R1 and R2 are two 32-bit memory cells, and F1 and F2 are two chaotic functions, respectively. PRNG execution produces a 32-bit Z at a time, which forms the output of the key stream by combining 32bit Z. Literature [25] suggests that PRNG discards the first 100 values and uses post-order values, given the security of the password.

AES Encryption Algorithms
As a block encryption algorithm, AES supports multiple encryption modes to meet different security requirements and transmission requirements. For example, Electronic codebook (ECB), Cipher-block chaining (CBC), Cipher feedback (CFB), Output feedback (OFB), and so on. In fact, the ECB mode is a common encryption mode of the AES. In CFB mode, CFB converts block ciphers to stream ciphers. Figure 3 shows the working mode of CFB. As shown in Figure 2, the Pseudo-random number generator (PRNG) is composed of a linear feedback shift register (LFSR), a bit recombination (BR), and nonlinear function F. Among them, R 1 and R 2 are two 32-bit memory cells, and F 1 and F 2 are two chaotic functions, respectively. PRNG execution produces a 32-bit Z at a time, which forms the output of the key stream by combining 32-bit Z. Literature [25] suggests that PRNG discards the first 100 values and uses post-order values, given the security of the password.

AES Encryption Algorithms
As a block encryption algorithm, AES supports multiple encryption modes to meet different security requirements and transmission requirements. For example, Electronic codebook (ECB), Cipher-block chaining (CBC), Cipher feedback (CFB), Output feedback (OFB), and so on. In fact, the ECB mode is a common encryption mode of the AES. In CFB mode, CFB converts block ciphers to stream ciphers. Figure 3 shows the working mode of CFB.   Figure 3a CFB mode encryption. The encryption process requires a shift register that is the same size as the block to initialize the register. Then, the contents of the register are encrypted using a block cipher. Figure 3b CFB mode decryption. Based on Exclusive OR (XOR) operation, CFB mode decryption can be implemented.
The generation of the key stream Zi and the ciphertext block Ci is as follows: There, ⊕ is an XOR operation.

The Four-Dimensional Hyperchaotic System
The four-dimensional hyperchaotic system can be expressed, as follows:  Figure 3a CFB mode encryption. The encryption process requires a shift register that is the same size as the block to initialize the register. Then, the contents of the register are encrypted using a block cipher. Figure 3b CFB mode decryption. Based on Exclusive OR (XOR) operation, CFB mode decryption can be implemented.
The generation of the key stream Z i and the ciphertext block C i is as follows: There, ⊕ is an XOR operation.

The Four-Dimensional Hyperchaotic System
The four-dimensional hyperchaotic system can be expressed, as follows: where a, b, c, and d are system control parameters. If the parameters are set to a = 35, b = 3, c = 12, d = 7, and e = 0.2, the four-dimensional chaotic system is hyperchaotic, and the system can generate four hyperchaotic sequences (y 1 , y 2 , y 3 , y 4 ).

Video Selective Encryption Scheme
We know that researchers encrypt semantic elements through different combinations to make the video content as obscure as possible, according to the previous introduction. Video encryption usually needs to consider compatibility, security, and timeliness, so it is important to propose a selective encryption algorithm to protect video privacy.

Selection of Important Syntax Elements
Selective encryption is a technology that saves time by encrypting a small amount of bitstream data, but it still has sufficient security [26]. Analyzing the research work of the predecessors, we found that important semantic elements in the video stream were encrypted in most video encryption schemes. For example, IPM, MVD, residual coefficient, and so on.
In reference [13], It is necessary to correctly guess that the MVD symbol has 2119904 possibilities, so the probability that the MVD symbol can be fully guessed is 1/2119904. Suppose that the attacker needs to correctly guess 80% of the MVD symbols to make the video visible, and guess that the combination of 80% MVD symbols has C 95923 119904 . It is very difficult to correctly guess 80% of the MVD symbols. Moreover, the analysis given [8] is reasonable in encrypting spatial information and motion information during H.264/AVC encoding. However, the scheme of encrypting the syntax elements only encrypts the equal-length codes to maintain the same bit rate as the original code stream [14]. Therefore, a selective video encryption scheme with high security, high efficiency, and format compatibility is proposed in this paper. When compared with reference [8], our proposed encryption scheme directly encrypts the semantic elements in the H.264 bitstream, thereby directly changing the code stream. From previous research work, we know that a video is encoded into multiple frames, and a frame consists of multiple fragments, and these fragments are independent of each other. According to this idea, we use the independence between video slices in the proposed encryption scheme. The AES with CFB mode is used to encrypt the IPM, MVDs, residual coefficient, and Delta QP [27] semantic elements in video. More importantly, the key between the slices is updated in due course. This keeps the encrypted video format compatible and real-time secure. The following video semantic elements need to be encrypted.
Motion Vector Difference symbol (MVDs): We need to encrypt motion vectors in order to destroy the movement information. In H.264/AVC, the motion vector mv is further predicted to obtain mv', and the motion vector difference mv d = mv − mv'. In the H.264/AVC baseline profile, the MVD is encoded while using Exp-Golomb entropy coding. The value of MVD and the corresponding Exp-Golomb codeword is indicated in the paper [28], and the last bit of the Exp-Golomb codeword is given to affect the MVD symbol. Therefore, we only need to encrypt the last bit of the Exp-Golomb codeword in the motion vector difference.
Residual coefficient: We need to encrypt some other sensitive data, that is, residual data, in order to further promote the security of the encryption video. In the H.264/AVC baseline profile, the format of quantization coefficient [29] used to encode residual blocks is as follows: {coeff_token, sign_of_trailing ones, level, total_zeros, run_before}. The encryption of the residual coefficients is done by modifying the trailing ones sign and the level codeword [28,30]. The encryption of the residual coefficients is implemented in our scheme by replacing the trailing ones sign, level, and totalzeros codewords.
Delta QP: Differences in quantization parameters (QP) can affect video quality. We need to encrypt the Delta QP parameters in order to make the texture information more distorted.

Selective Encryption Process
We select the IPM, MVDs, and the residual coefficients affecting the texture information, Delta QP [27] to encrypt the video, by analyzing the structure of H.264/AVC video code stream. Figure 4 shows the flow chart of selective video encryption. totalzeros codewords.
Delta QP: Differences in quantization parameters (QP) can affect video quality. We need to encrypt the Delta QP parameters in order to make the texture information more distorted.

Selective Encryption Process
We select the IPM, MVDs, and the residual coefficients affecting the texture information, Delta QP [27] to encrypt the video, by analyzing the structure of H.264/AVC video code stream. Figure 4 shows the flow chart of selective video encryption.
The H.264 encoder encodes a video into multiple slices, and the slices are independent of each other. We use this feature of video coding to propose an efficient and secure video encryption scheme. We regard = ( ) ⊕ as the mark of the CFB mode while using key k to encrypt the n-bit . Figure 3 shows the implementation scheme. We chose to use CFB mode to maintain the compression ratio. As can be seen from Figure 3, plaintext block is the same size as the ciphertext block . Furthermore, ciphertext from the previous block is directly used to encrypt the current block. The algorithm proposed in this paper has three stages: constructing the plaintext block , encrypting the plaintext block , and replacing the encrypted information with the original codeword.
Plaintext construction: Since the slices are independent from each other, we encrypt the important semantic elements of each slice in units of slices while using the PRNG generation key in Section 2.2. In our scheme, we can copy the important semantic elements of each slice in the H.264 code stream directly to the vector to create the plaintext, and until the vector is completely filled or reaches the edge of the slice. We think that L is the size of the vector , and ( ) is the length at which the vector is filled. Where L = 128. If ( ) < , we fill with ( ) = 0 , where ∈{ ( ) + 1, … , }. From the perspective of encryption algorithms, padding is to improve the security of the algorithm. Encrypting plaintext while using CFB mode of AES. Figure 3a shows the encryption process for CFB mode.
is created by the previous ciphertext block . According to Formula (1), we perform xor operation between Z and plaintext block P to achieve the encryption of the current plaintext block. In the key generation phase, we used the PRNG that was proposed by Xu et al. [25] in Section 2.2 to generate the key. For security reasons, we will consider the sequence after the 100th 32-bit as the key stream.
Replace the original bit: this is the last stage of encryption. In this phase, we replace the vector in the original stream with the encrypted ciphertext vector . The H.264 codestream is sequentially accessed in the plaintext construction phase, given the length of each important semantic element The H.264 encoder encodes a video into multiple slices, and the slices are independent of each other. We use this feature of video coding to propose an efficient and secure video encryption scheme. We regard C i = E key (C i−1 ) ⊕ P i as the mark of the CFB mode while using key k to encrypt the n-bit P i . Figure 3 shows the implementation scheme. We chose to use CFB mode to maintain the compression ratio. As can be seen from Figure 3, plaintext block P i is the same size as the ciphertext block C i . Furthermore, ciphertext from the previous block is directly used to encrypt the current block. The algorithm proposed in this paper has three stages: constructing the plaintext block P i , encrypting the plaintext block P i , and replacing the encrypted information with the original codeword.
Plaintext construction: Since the slices are independent from each other, we encrypt the important semantic elements of each slice in units of slices while using the PRNG generation key in Section 2.2. In our scheme, we can copy the important semantic elements of each slice in the H.264 code stream directly to the vector P i to create the plaintext, and until the vector P i is completely filled or reaches the edge of the slice. We think that L is the size of the vector P i , and L(P i ) is the length at which the vector P i is filled. Where L = 128. If L(P i ) < L, we fill P i with P(i) = 0, where i ∈{L(P i ) + 1, . . . , L}. From the perspective of encryption algorithms, padding is to improve the security of the algorithm. Encrypting plaintext while using CFB mode of AES. Figure 3a shows the encryption process for CFB mode. Z i is created by the previous ciphertext block C i−1 . According to Formula (1), we perform xor operation between Z and plaintext block P to achieve the encryption of the current plaintext block. In the key generation phase, we used the PRNG that was proposed by Xu et al. [25] in Section 2.2 to generate the key. For security reasons, we will consider the sequence after the 100th 32-bit as the key stream.
Replace the original bit: this is the last stage of encryption. In this phase, we replace the P i vector in the original stream with the encrypted ciphertext vector C i . The H.264 codestream is sequentially accessed in the plaintext P i construction phase, given the length of each important semantic element (l n , l n−1 , . . . , l 1 ). Subsequently, we replace the corresponding part of the plaintext P i in the H.264 stream with the ciphertext C i . At this point, our proposed selective encryption scheme ends. Figure 3b shows the decryption process for the CFB mode. The ciphertext C i−1 of the previous block is used as an input of the AES encryption algorithm to generate Z i (here, the same encryption function E key and encryption key k as the encryption phase are used.), and the current ciphertext block C i and Z i are XORed to generate a plaintext block P i . Repeat the above operation until the plaintext vector P i is generated. Afterwards, the plaintext vector P i is divided into segments of length l n , l n−1 , . . . . . . ,l 1 to replace the corresponding stream in ciphertext to generate the original stream. Figure 5 shows the encryption and decryption process of the proposed 4-D hyperchaotic system. Figure 4 is selective video encryption, which mainly encrypts important elements in the video slice. At the same time, there is independence between video slices. It can be seen from Figure 5 that we use the 4-D hyperchaotic system to encrypt selective video encryption results. Therefore, the selective video encryption result is the input of the 4-D hyperchaotic system for Figure 5. At this time, the ciphertext image in Figure 5 is a result that is based on video encoding and 4-D hyperchaotic hybrid encryption.  Figure 3b shows the decryption process for the CFB mode. The ciphertext of the previous block is used as an input of the AES encryption algorithm to generate (here, the same encryption function E and encryption key k as the encryption phase are used.), and the current ciphertext block and are XORed to generate a plaintext block . Repeat the above operation until the plaintext vector is generated. Afterwards, the plaintext vector is divided into segments of length , ,……, to replace the corresponding stream in ciphertext to generate the original stream. Figure 5 shows the encryption and decryption process of the proposed 4-D hyperchaotic system. Figure 4 is selective video encryption, which mainly encrypts important elements in the video slice. At the same time, there is independence between video slices. It can be seen from Figure 5 that we use the 4-D hyperchaotic system to encrypt selective video encryption results. Therefore, the selective video encryption result is the input of the 4-D hyperchaotic system for Figure 5. At this time, the ciphertext image in Figure 5 is a result that is based on video encoding and 4-D hyperchaotic hybrid encryption.

Experimental Results and Analysis
In this section, we will present the results of selective video encryption. We used the JM8.6 reference software in the video coding phase. In the experiment, we analyzed the proposed selective encryption scheme using seven reference sequences with a frame rate of 30 frames/s and a format of

Experimental Results and Analysis
In this section, we will present the results of selective video encryption. We used the JM8.6 reference software in the video coding phase. In the experiment, we analyzed the proposed selective encryption scheme using seven reference sequences with a frame rate of 30 frames/s and a format of QCIF (176 × 144). These seven reference video sequences are combined by different motions, colors, contrasts, and objects. Each of the seven reference sequences has 300 frames for experimental analysis, and the structure of the GOP is "IPPP".

Security Analysis of Selective Video Encryption Scheme
Perceptual security means that the video sequence after encryption cannot be correctly identified, depending on the attributes of the encryption. In this section, we will discuss the perceived security of our proposed encrypted video images from two aspects: subjective video quality and objective video quality.

Subjective Video Quality Analysis
In our proposed scheme, we encrypt the IPM, RC, Delta QP, and MVDs semantic elements in the video sequence by means of fragmentation, which guarantees the visual perception security of video encryption. This paper proposes a new five-dimensional (5-D) hyperchaotic algorithm to protect data privacy in order to further improve the security of video encryption. Only four benchmark sequence renderings are shown here due to space limitations ( Figure 6). This includes the original video image (OVI) (OVI 1, OVI 2, OVI 3, OVI 4), Selective video encryption output (SVEO) (SVEO 1, SVEO 2, SVEO 3, SVEO 4), Mixed encrypted output (MEO) (MEO 1, MEO 2, MEO 3, MEO 4), and the decrypted video image (DVI ) (DVI 1, DVI 2, DVI 3, DVI 4). Where, the first column in Figure 6 is the original video image. The second column in Figure 6 is selective video encryption output. The third column in Figure 6 is the mixed encrypted output. The last column in Figure 6 is the decrypted video image. The mixed encryption output is the mixed encryption result. The specific step is to perform selective encryption first and then perform 5-D hyperchaotic encryption.
In Figure 6, if the user needs to ensure sufficient real-time and large amount of data, we recommend that the user choose a selective video encryption algorithm. This method can encrypt video data in real time to a certain extent. We recommend that users choose a hybrid encryption method, which mainly focuses on video encryption security, if the user needs sufficient security and the amount of data is small. Figure 6 shows the subjective evaluation of the algorithm. In addition, the encrypted video sequence maintains the same size as the original video sequence (as shown in Table 1), and the bitstream format compatibility is well accomplished.  Table 1 shows that the size of the reference video sequence is 10 KM in our encryption scheme. The video sequence size has significantly dropped after H.264 encoding. Additionally, the encrypted video bit stream is the same size as the encoding video bit stream, which retains the original bit characteristics very well.
(SVEO 1, SVEO 2, SVEO 3, SVEO 4), Mixed encrypted output (MEO) (MEO 1, MEO 2, MEO 3, MEO 4), and the decrypted video image (DVI ) (DVI 1, DVI 2, DVI 3, DVI 4). Where, the first column in Figure 6 is the original video image. The second column in Figure 6 is selective video encryption output. The third column in Figure 6 is the mixed encrypted output. The last column in Figure 6 is the decrypted video image. The mixed encryption output is the mixed encryption result. The specific step is to perform selective encryption first and then perform 5-D hyperchaotic encryption  Figure 6 is the original video image. The second column in Figure 6 is selective video encryption output. The third column in Figure  6 is the mixed encrypted output. The last column in Figure 6 is the decrypted video image. Figure 6. Subjective evaluation of the algorithm. The first column in Figure 6 is the original video image. The second column in Figure 6 is selective video encryption output. The third column in Figure 6 is the mixed encrypted output. The last column in Figure 6 is the decrypted video image.

Objective Video Quality Analysis
In this section, we also used Peak Signal to Noise Ratio (PSNR) and Structural Similarity Index (SSIM) to measure the perceived effect of the video. PSNR is widely used to objectively evaluate video quality. A larger value of PSNR indicates a better video quality, and vice versa. The calculation formula of PSNR is obtained according to the mean square error (MSE) of the error of two images. The calculation formula of MSE is expressed by Equation (3), and the calculation formula of PSNR is represented by Equation (4). However, PSNR does not evaluate video quality very well because of the nonlinearity of the human visual nervous system. Therefore, it has been proposed to use SSIM to evaluate the similarity of two images. The value of SSIM ranges from 0 to 1. A value that is close to 1 indicates that the reference image is more similar to the target image.
Symmetry 2020, 12, 332 11 of 15 Therefore, the formula for calculating PSNR is described, as follows: where, H and W indicate the height and width of the video image, respectively. P org (x, y) show the pixel value of the original video image, and P enc (x, y) indicate the pixel value of the encrypted video image. k represents the number of bits per pixel, which is generally 8. Table 2 gives the average PSNR values of the video after encoding and after encryption. Through experiments, we know that the benchmark sequence outputs 264 streams after passing through the H.264 encoder. However, our proposed selective encryption scheme encrypts the important semantic elements in the 264 code stream, and the encrypted video file is still 264 streams. In the Table 2, we compare the code stream after encoding with the code stream after encryption to calculate the PSNR value. Similarly, Table 3 provides the average SSIM value of the encrypted video. We observe Table 3 to find that the video image after encryption is quite different from the video image after encoding.

Evaluation Comparison
In this section, we will further elaborate the objective evaluation indicators of the proposed algorithm. It includes original image performance indicators, selective video encryption performance, and hybrid encryption performance. We choose information entropy, PSNR, and SSIM to evaluate the performance of the algorithm. Table 4 shows the objective performance evaluation of the algorithm. We randomly selected two video images to evaluate the performance of the algorithm. The hybrid encryption algorithm is superior to the original selective video encryption from a security perspective, as can be seen from Figure 4. The closer the value of the information entropy is to 8, the better the algorithm's encryption performance. The smaller the PSNR value, the better the algorithm's encryption performance. The smaller the value of SSIM, the better the encryption performance of the algorithm.
We compare the encryption scheme that is presented in this article with the other seven newest video encryption schemes. These encryption schemes use different algorithms to encrypt different parameters in the video. Therefore, we mainly compare encryption parameters, encryption algorithms, format compatibility, and bit addition. Table 5 gives the results of the comparison. Table 5. Comparison of existing video encryption schemes with proposed encryption schemes.

Existing Schemes Encrypted Semantic Element Format Compliant
Bit Increase

Encryption Algorithm
Xu [25] IPM, MVDs, T1s, signs of the NZ coefficients yes no Chaos Abomhara [31] I frame no no AES Shahid [32] T1s, NZ level yes no AES Fei [33] IPM, MVD, Signs of residual yes yes Chaos Sung [34] Motion vector yes yes RC4 Wei [35] NALUs yes yes RC4 Wang [36] IPM, MVD, Quantization coefficients yes yes Hash and AES Ours IPM, MVDs, Signs of residual, delta QP yes yes Chaos and AES As can be seen from Table 5, the format compatibility of the encrypted video data is a research hotspot of video encryption. We all know that the target of video encryption is to damage the economic value of video. In general, our proposed video selective encryption scheme is completely compliant with the decoder, and the compression ratio has not changed, and it has a sufficiently long key length to protect the video information.

Security Analysis
In the video selective encryption scheme, we use a pseudo-random sequence generator (PSNG) that is composed of double chaotic maps to generate an encryption key. The semantic features of the H.264 slice are used to selectively encrypt important semantic elements in the video sequence. After reading a large amount of literature, we know that the video sequence after H.264 encoding is divided into multiple slices, and the slices are independent of each other. This scheme uses this feature to achieve the selective video encryption. We will analyze the security of selective video encryption schemes from two other aspects.

Key Volume
The key volume is another factor in measuring password security. In this scheme, we did not use AES's original key generation scheme, but used the pseudo-random key generator that was proposed by Xu et al. [25] to generate the key. The generator's key space is 2 (2 * 32+16 * 31) * 2 * 2 32 2 * 2 32 = 2 626 . It can be seen that the key volume is very large enough to withstand exhaustive attack. The 4-D hyperchaotic system is introduced to further protect the privacy of video images, which can meet the requirements of small amount of video data and strong security video encryption. The size of the key space of the 4-D hyperchaotic system is: 10 15 × 10 16 × 10 16 × 10 16 × 10 16 × 10 16 = 10 95 ≈ 2 315 . Therefore, the proposed algorithm has strong security.

Security Analysis of Selective Video Encryption Scheme
We use the fragmentation feature of H.264 video coding to encrypt important semantic elements in the video. In our experiments, we encode the video into 300 independent slices and encrypt the important semantic elements in each slice. The encrypted video size is the same as the encoded video size (Table 1 has been given). The security of this scheme not only depends on the security of the encryption algorithm, but also on the security of each slice in the video. Fortunately, the slices after H.264 encoding are independent of each other. Therefore, even if one slice in the video is correctly decrypted, it will not affect the security of other video information.
For the convenience of experiment, we encrypt each video frame as a slice. The benchmark video sequence we use is 300 frames, so each video has 300 slices, so each video is encrypted with 300 keys. We know that it is very difficult to correctly guess a key, and then the probability that 300 keys are correctly guessed is almost zero. Furthermore, we know that each video frame can be encoded into a plurality of mutually independent slices. If each video frame is encoded as n (n > 1) slices, then n * 300 keys are required to encrypt a video. In this case, it is even more difficult to obtain a clear video.
In the encryption process, we choose the CFB mode of AES to ensure that the video is encrypted and it still meets the requirements of real-time and format compatibility. Furthermore, the AES algorithm for 128-bit encryption keys is a good choice compared to most encryption algorithms, because it is assumed that all possible keys are used to crack 128-bit keys at 50 billion keys per second. The required time is 5 × 1021 years. Therefore, the AES algorithm can not only guarantee the security of the encryption mean, but also ensure the real-time and format compatibility requirements after the video is encrypted. Therefore, theoretically speaking, our selective video encryption scheme is superior to other video encryption schemes. Figure 5 depicts a subjective evaluation of the algorithm. The first column in Figure 5 is the original video image. The middle column in Figure 5 is the encrypted video image. The last column in Figure 5 is the decrypted video image. Table 2 illustrates the peak signal-to-noise ratio of the algorithm in each color space. Table 3 shows the structural similarity index of the encryption algorithm. The smaller the peak signal-to-noise ratio and the lower the structural similarity index, the better the performance of the selected encryption algorithm.
In Table 4, we will further elaborate the objective evaluation indicators of the proposed algorithm. It includes original image performance indicators, selective video encryption performance, and hybrid encryption performance. We choose information entropy, PSNR, and SSIM to evaluate the performance of the algorithm. Table 5 compares the existing video encryption schemes. The algorithm performance comparison analysis is completed from the perspectives of encryption semantic elements, format complexity, bit number impact, and encryption unit. The experimental results show that the proposed selective encryption algorithm has better performance. Therefore, the security analysis of selective video encryption algorithm is given in this section.
In summary, this section mainly simulates and analyzes the proposed algorithm. From the experimental results, the performance of the proposed algorithm is better than that of most current selective video encryption algorithms.

Conclusions
The selective video encryption scheme that is proposed in this paper cleverly utilizes the segmentation characteristics of H.264 encoding and encrypts the key semantic elements (IPM, MVD, residual coefficient, delta_QP) in each slice. The main contribution of this paper is to propose a new selective video encryption method that is based on video coding technology and 4-D hyperchaotic system. This method provides users with two alternatives: (1) When the amount of encrypted data is small and sufficient security is required, the user can choose to complete selective video encryption based on video encoding and 4-D hyperchaotic system; (2) When the amount of encrypted video data is large and sufficient real-time is required, users can choose a method that is based on video characteristic encoding to implement selective video encryption. In this paper, we use different reference video sequences containing motion, texture, and objects to analyze the perceived quality of encrypted video. Its evaluation indicators include information entropy, PSNR, and SSIM. The experiments show that our proposed video encryption scheme has better encryption effect and less encryption time. In addition, we compare the selective video encryption scheme proposed in this paper with other existing video encryption schemes. Theoretical analysis confirms that our proposed selective encryption scheme is superior to other video encryption schemes.