Privacy Protection in Real Time HEVC Standard Using Chaotic System †

Abstract: Video protection and access control have gathered steam over recent years. However, the most common methods encrypt the whole video bit stream as unique data without taking into account the structure of the compressed video. These full encryption solutions are time and power consuming and, thus, are not suited to real-time applications. In this paper, we propose a Selective Encryption (SE) solution for Region of Interest (ROI) security based on the tile concept in the High Efficiency Video Coding (HEVC) standard, together with selective encryption of all sensitive parts in videos. The SE solution depends on a chaos-based stream cipher that encrypts a set of HEVC syntax elements normatively, that is, the bit stream can be decoded with a standard HEVC decoder, and a secret key is only required for ROI decryption. The proposed ROI encryption solution relies on the independent tile concept in HEVC that splits the video frame into independent rectangular areas. Tiles are used to separate the ROI from the background, and only the tiles containing the ROI are encrypted. In inter coding, the independence of tiles is guaranteed by limiting the motion vectors of non-ROI tiles to use only the unencrypted tiles in the reference frames. Experimental results show that the encryption solution performs secure video encryption in a real-time context, with small bit rate and complexity overheads.


Introduction
High Efficiency Video Coding (HEVC) is the newest video coding standard issued by the Video Coding Experts Group (VCEG) and the ISO/IEC Moving Picture Experts Group (MPEG) [1]. The most important goal of the HEVC standardization effort is to achieve a 50% bitrate reduction for similar video quality [2] compared with its predecessor, H.264/Advanced Video Coding (AVC) [3]. In the future, HEVC is expected to replace the previous video coding standards in emerging applications, such as High Dynamic Range (HDR), Virtual Reality (VR), High Frame Rate (HFR), ultra high resolutions (4K and 8K), and so forth. In such applications, security and confidentiality of image and video content are of fundamental importance for privacy protection. Thus, several studies have been devoted to these goals in the last decade.
In general, the most common approach for content protection is to encrypt the whole bit stream. In this method, the video bit stream is treated as simple text data without taking into consideration the structure of the compressed video, and it is decodable only after correct decryption, even when only some parts of the video need protection. This method limits the usage of the content to users who have access rights to the encrypted parts. In addition, these algorithms are time and power consuming and not suitable for real-time video applications, mainly on mobile platforms. Consequently, Selective Encryption (SE) has emerged as a beneficial solution to these full encryption problems [4][5][6][7].
The main objective of SE is to decrease the amount of information to encrypt while keeping an adequate level of security. Thus, only the most sensitive information in the bit stream is encrypted. In this work, we concentrate on SE that only hides the Region of Interest (ROI) in the video (human faces, personal data, etc.) and retains the background of the video as is. In our approach, each HEVC frame is first divided into independent rectangular sections called tiles [8], and then only the tiles relating to the ROI are encrypted.
The proposed work encrypts a group of HEVC parameters encompassing Motion Vector Differences (MVDs), Motion Vector (MV) signs, Transform Coefficients (TCs), and TC signs, as given in Reference [9]. Moreover, we propose a format-compliant encryption algorithm for the luma and chroma Intra Prediction Modes (IPMs). The proposed SE solution relies on a chaos-based stream cipher, which is based on the chaotic key stream generator published in References [10,11]. It includes a set of HEVC coding restrictions that prevent the encryption from propagating outside the ROI under an Inter coding configuration. The encryption and decryption operations are validated in practice by implementing them in the real-time Kvazaar HEVC encoder [12] and the OpenHEVC decoder [13], respectively.
The rest of this paper is organized as follows. Section 2 presents the background related to the HEVC standard and selective video encryption. The proposed selective encryption of IPMs and the ROI encryption in HEVC are investigated in Sections 3 and 4, respectively. Performance evaluations and the associated discussion are given in Section 5. Finally, Section 6 concludes the paper and gives some perspectives for future work.

HEVC Standard
The new tools defined in the HEVC standard include larger coding blocks, quad-tree block partitioning, more precise Intra and Inter predictions, optimized entropy coding, and the new in-loop Sample Adaptive Offset (SAO) filter. The HEVC video frame is divided into square Coding Tree Units (CTUs) of fixed size, from 16 × 16 up to 64 × 64 pixels. Each CTU can be recursively divided through quad-tree partitioning into Coding Units (CUs). In YUV color representation, each CU consists of three Coding Blocks (CBs), one for luma and two for chroma. The decision between intra and inter prediction is carried out at the CU level. CUs are predicted in intra mode from reconstructed neighbouring samples in the same slice. For I (Intra coded) slices, only intra prediction is used, whereas intra or inter prediction can be used in P and B slices [1,14]. In this section, we focus on three HEVC characteristics: entropy coding, Intra prediction modes, and parallelization approaches.

HEVC Entropy Coding
HEVC specifies Context-Adaptive Binary Arithmetic Coding (CABAC) for entropy coding. The CABAC technique is composed of three main tasks: binarization, context modeling, and arithmetic coding [15]. These three functions are shown in Figure 1. The binarization function transforms the syntax elements into binary symbols (bins). Five binarization methods are identified in HEVC: Unary (U), Truncated Unary (TU), Fixed Length (FL), Truncated Rice code with an adaptive context p (TRp), and the k-th order Exp-Golomb (EGk) codes. Subsequently, the context modeling updates the probabilities of bins and, finally, the arithmetic coding compresses the bins into bits according to the estimated probabilities. The arithmetic coding can be performed with context coding or with bypass coding. The first mode makes use of the estimated probabilities of syntax elements, whereas the second one considers each bin with an equal probability of 0.5.
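As an illustration of the binarization step, the following minimal sketch produces bin strings for three of the schemes listed above (TU, FL, and EGk). The function names are hypothetical, the bins are returned as text strings for readability, and the sketch does not cover context modeling or arithmetic coding.

```python
def truncated_unary(value, c_max):
    """Truncated Unary: `value` ones, then a terminating zero unless value == c_max."""
    bins = "1" * value
    return bins if value == c_max else bins + "0"


def fixed_length(value, n_bins):
    """Fixed Length: the unsigned binary representation of `value` on n_bins bins."""
    return format(value, "0{}b".format(n_bins))


def exp_golomb_k(value, k):
    """k-th order Exp-Golomb: unary prefix of ones, a zero, then a suffix of k' bins."""
    bins = []
    while value >= (1 << k):
        bins.append("1")          # prefix grows while the value exceeds the current range
        value -= 1 << k
        k += 1
    bins.append("0")              # prefix terminator
    for shift in range(k - 1, -1, -1):
        bins.append(str((value >> shift) & 1))   # suffix, most significant bin first
    return "".join(bins)


if __name__ == "__main__":
    print(truncated_unary(1, 2), fixed_length(5, 5), exp_golomb_k(3, 1))
```

In SE schemes of this kind, the bypass-coded bins (for example EGk suffixes and sign bins) are the natural candidates for encryption, because changing them does not disturb the context models.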

Intra Prediction Modes
The HEVC encoder achieves higher coding efficiency partly by offering 35 IPMs. These modes consist of one Planar mode (mode 0), one DC mode (mode 1), and 33 Angular modes (modes 2-34). For an effective coding of the 35 IPMs, a shortlist of three Most Probable Modes (MPMs) is defined in the HEVC specification. This list is derived from the IPMs of the neighbouring blocks and other fixed IPMs. Three syntax elements are used to signal the IPM of a luma Prediction Block (PB) in the bitstream. As shown in Table 1, the first flag signals whether an MPM is used, the second flag indicates whether the first MPM is selected, and the third flag indicates which of the two remaining MPMs is selected. Therefore, MPM0, MPM1, and MPM2 are coded by 2, 3, and 3 bins, respectively. The first bin (shown in red in Table 1) is coded using a CABAC context, while the other bins are bypass coded. The remaining 32 modes outside the MPM list are coded by a 5-bin FL code that is bypass coded. In HEVC, an adaptive scanning method of TCs is used for blocks of size 4 × 4 and 8 × 8 to benefit from the statistical distribution of the active coefficients in 2-D transform blocks. Vertical scan is used for modes 6-14, horizontal scan for modes 22-30, and diagonal scan for the rest of the modes. The chroma IPMs are derived from the luma IPMs if the Derived flag is set to 1. Otherwise, the chroma IPMs are encoded by three bits [1,16]. Table 2 shows the derivation process of the chroma IPMs. If the derived chroma intra mode is identical to the initial chroma intra mode, then the Intra angular mode 34 is used for the chroma; otherwise, the derived one is used.

Tiles are defined in HEVC [15,17,18] for parallel encoding/decoding of a single picture. The input frame can be divided into several tiles, each of which comprises an integer number of individually decodable Coding Tree Blocks (CTBs).
The number of tiles and the location of their boundaries can be defined consistently for the entire sequence or changed from picture to picture. Besides, the CABAC context is reset at the start of each tile. Tiles allow a flexible classification of CTUs. In addition, tiles provide better pixel correlation and coding efficiency than slices, as they do not carry header information.
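The luma IPM signalling described in this section (one context-coded flag, then either an MPM index or a 5-bin fixed-length remainder) can be sketched as follows. This is an illustrative sketch, not the reference implementation: the function name is hypothetical, the MPM list is assumed to be already derived, and the remainder is computed by skipping the MPMs that are smaller than the mode, which is how the 32 non-MPM modes fit into 5 bins.

```python
def luma_ipm_bins(mode, mpm_list):
    """Return the bin string signalling `mode` (0..34) given three distinct MPMs.

    First bin: 1 if the mode is in the MPM list (context coded), 0 otherwise.
    MPM case: truncated-unary index (0, 10, 11), bypass coded.
    Non-MPM case: 5-bin fixed-length remainder, bypass coded.
    """
    if mode in mpm_list:
        idx = mpm_list.index(mode)
        return "1" + ("0", "10", "11")[idx]
    # Remove the MPMs below `mode` so that the 32 remaining modes map to 0..31.
    remainder = mode - sum(1 for m in mpm_list if m < mode)
    return "0" + format(remainder, "05b")


if __name__ == "__main__":
    mpms = [0, 1, 26]   # hypothetical list: Planar, DC, vertical
    for mode in (0, 1, 26, 10, 34):
        print(mode, luma_ipm_bins(mode, mpms))
```

With this layout, MPM0 costs 2 bins, MPM1 and MPM2 cost 3 bins each, and any other mode costs 6 bins, which is why changing an IPM before entropy coding can alter the bit rate.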

Selective Video Encryption
A number of encryption algorithms have been devoted to HEVC videos. Shahid et al. [19] introduced a joint compression and SE solution that operates on CABAC bin strings. Hamidouche et al. [9,20] published a selective chaos-based crypto-compression system for HEVC and its scalable extension, Scalable High efficiency Video Coding (SHVC) [21]. Boyadjis et al. [22] presented an extended SE solution for H.264/AVC and HEVC streams. It addresses the main security issue of SE: the amount of information that leaks from a secured video. The contribution in Reference [22] improves both the numerical and the visual encryption performance relative to some state-of-the-art solutions.
Many studies have addressed the encryption of ROI in video content. Peng et al. [23] proposed an encryption scheme for human faces in H.264/AVC video. This solution is based on Flexible Macroblock Ordering (FMO) and chaos. Dufaux et al. [24] presented an efficient methodology to encrypt ROI using code stream-domain encryption. Research in Reference [25] provided rectangular region privacy by de-identifying faces. These approaches ensure that face recognition software cannot reliably recognize de-identified faces, even though part of the facial details are protected. In Reference [26], the authors examined privacy protection in H.264/Scalable Video Coding (SVC) [27]. This solution first tracks face areas (ROI) and then encrypts them in the transform domain by scrambling the sign of the non-zero TCs at all SVC layers.
None of these studies takes into account the specific HEVC tile representation. In addition, no solution has investigated the performance of luma and chroma IPM encryption. This is the first study that handles the encryption of luma and chroma IPMs in the HEVC standard. In the following sections, we present our proposed solution based on IPMs and chaos-based encryption systems as well as the ROI encryption implementation in the HEVC encoder.

Intra Prediction Parameters Encryption
In the HEVC standard, there are three scanning orders for the quantized TCs. For Intra blocks, the scanning order is derived from the IPM. The proposed solution encrypts the IPM while keeping the original scanning order of the mode (the order before encryption). This makes the IPM encryption format-compliant with HEVC, that is, decodable with any standard HEVC decoder.
Distinct from the encryption of other syntax elements, the encryption of the IPMs is processed before the entropy coding and, thus, may reduce Rate-Distortion (RD) performance.

CABAC Level Encryption
The proposed encryption solution is realized at the CABAC bin string level for a set of sensitive HEVC parameters including MVs, MV signs, TCs, and TC signs. These syntax elements are processed as illustrated in Figure 1. The selectively encrypted HEVC bitstreams are format compliant and meet the real-time requirements.

Encryption System Based on Chaos
Our encryption system relies on chaos, a state of dynamical systems whose apparently random disorder and irregularities are governed by deterministic laws that are highly sensitive to initial conditions. For a particular syntax element, the key stream generator generates the key stream needed to cipher the data. The key stream generator is adopted from our previous work [10,11]. The internal state, which carries the main cryptographic complexity of the system, consists of two third-order recursive filters. The first recursive cell contains a discrete Skew Tent map and the second one contains a discrete Piecewise Linear Chaotic Map (PWLCM). These chaotic maps act as non-linear filters. A new Initial Vector (IV) value is produced on each call of the key stream generator, so that each call produces a different key stream sequence. The detailed structure and the cryptographic security analysis of the key stream generator are elaborated in Reference [10]. The scheme of our key stream generator based on chaotic maps is depicted in Figure 2. The proposed system uses the chaos-based stream cipher to encrypt the sensitive bits in the frame. This encryption algorithm, as mentioned, relies on an efficient chaotic generator that uses two chaotic recursive filters, a perturbation technique, and chaotic multiplexing. It is a synchronous stream cipher: the sender and the receiver need the shared secret key to operate the chaos-based generator and produce the key streams used in the encryption and decryption processes. The structure of this encryption system is shown in Figure 3.
The equations of the discrete Skew Tent map (Equation (1)) and the discrete PWLCM map (Equation (2)) are given in [10]. The values X_s[n] and X_p[n] produced by the recursive cells in the internal state are fed to the output function. Then, the output sequence X_g(n) is obtained either by chaotic multiplexing controlled by the chaotic sequence X_th = X1_s(n − 1) ⊕ X1_p(n − 1) and by a threshold Th = 2^(N−1), as shown in Equation (3), or by XORing X1_s and X1_p, as given in Equation (4).
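To make the role of the two maps and of the output function more concrete, here is a heavily simplified sketch of such a generator. Everything in it is an assumption made for illustration: the word size N = 32, the particular discretizations of the Skew Tent and PWLCM maps (a commonly used integer form, folded by symmetry for the PWLCM, so boundary cases may differ from [10]), the choice of which cell output the multiplexer selects, and the omission of the recursive filters, the perturbation technique, and the IV handling. It is not the generator of [10,11] and must not be used for real encryption.

```python
N = 32                      # assumed word size in bits
TWO_N = 1 << N
MASK = TWO_N - 1
TH = 1 << (N - 1)           # multiplexing threshold Th = 2^(N-1)


def skew_tent(x, p):
    """Assumed discrete skew tent map on N-bit integers, with 0 < p < 2^N."""
    if x == 0 or x == p:
        return MASK
    if x < p:
        return min(-((-TWO_N * x) // p), MASK)                    # ceil(2^N * x / p)
    return min(-((-TWO_N * (TWO_N - x)) // (TWO_N - p)), MASK)    # ceil on the falling branch


def pwlcm(x, p):
    """Assumed discrete piecewise linear chaotic map, with 0 < p < 2^(N-1)."""
    if x > TH:
        x = TWO_N - x       # the map is symmetric around 2^(N-1)
    if x == 0:
        return MASK - p
    if x <= p:
        return min((TWO_N * x) // p, MASK)
    return min((TWO_N * (x - p)) // (TH - p), MASK)


def keystream(key_s, key_p, p_s, p_p, n_words, mode="mux"):
    """Return n_words N-bit key stream words from two chaotic cells.

    mode="mux": chaotic multiplexing driven by X_th and Th (selection rule assumed).
    mode="xor": XOR of the two cell outputs.
    """
    xs, xp = key_s & MASK, key_p & MASK
    out = []
    for _ in range(n_words):
        x_th = xs ^ xp                      # built from the previous samples
        xs = skew_tent(xs, p_s)
        xp = pwlcm(xp, p_p)
        if mode == "xor":
            out.append(xs ^ xp)
        else:
            out.append(xs if 0 < x_th < TH else xp)
    return out


if __name__ == "__main__":
    print([hex(w) for w in keystream(0x12345678, 0x9ABCDEF0, 0x6F3A21C5, 0x2B7E1516, 4)])
```

Because the generator is deterministic, a sender and a receiver that share the same initial conditions and parameters obtain the same key stream, which is what a synchronous stream cipher requires.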
To evaluate the statistical performance of the produced key stream, we also use one of the most popular tests for investigating the randomness of binary data, namely the NIST statistical test suite [28]. This suite is a statistical package that consists of 188 tests and sub-tests proposed to assess the randomness of arbitrarily long binary sequences. These tests focus on a variety of different types of non-randomness that could exist in a sequence. We generated 100 different binary sequences, each with a different secret key and 31,250 samples (corresponding to 1 million bits), and applied the NIST tests to all of them. For each test, a set of 100 P_values is produced, and a sequence passes a test whenever P_value ≥ α = 0.01, where α is the level of significance of the test. A value of α = 0.01 means that 1% of the 100 sequences are expected to fail. The proportion of sequences passing a test is equal to the number of P_values ≥ α divided by 100. Figure 4 presents the obtained proportion versus test for delay 1. As can be seen, all the 188 tests and sub-tests are passed. Notice that the minimum pass rate for each statistical test, with the exception of the random excursion variant test, is approximately 0.960150 for 100 binary sequences. The minimum pass rate for the random excursion variant test is approximately 0.952091 for a sample size of 62 binary sequences. Our algorithm passed all the NIST tests, as shown in Table 3.

The encryption of syntax elements at the CABAC level, including MV differences, MV signs, TCs, and TC signs, is given by the following formula:

$$C_i = P_i \oplus X_i \qquad (5)$$

where $P_i$ refers to the syntax elements, $C_i$ to the encrypted syntax elements, and $X_i$ to the key stream bits.
In addition, the encryption of the luma and chroma IPMs is carried out as follows. Let $N$ be the number of items in the vector $V = [IPM_1, IPM_2, \cdots, IPM_N]$, $V \in \mathbb{Z}^N$, $X_i$ the key stream bits generated by the key stream generator, and $j$ the index of the IPM to encrypt in the vector $V$. The new value, $IPM_{encr}$, produced for the current IPM of index $j$ is given as follows:

$$IPM_{encr} = V[(j + X_i) \bmod N] \qquad (6)$$

The decryption algorithm is performed by the inverse operations of (5) and (6). Finally, the secret key that is used to initialize the chaotic generator must be shared between the encoder and the decoder.
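The minimum pass rates quoted above follow from the proportion rule of the NIST test suite, which accepts a test when the proportion of passing sequences is at least (1 − α) − 3·sqrt(α(1 − α)/m) for m sequences. The short sketch below (the function name is ours) reproduces the two quoted thresholds.

```python
import math


def min_pass_rate(n_sequences, alpha=0.01):
    """Lower bound of the acceptable proportion of passing sequences."""
    p_hat = 1.0 - alpha
    return p_hat - 3.0 * math.sqrt(p_hat * alpha / n_sequences)


if __name__ == "__main__":
    print(round(min_pass_rate(100), 6))   # threshold for 100 sequences
    print(round(min_pass_rate(62), 6))    # threshold for 62 sequences
```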
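The two encryption operations of this section can be sketched as follows, under stated assumptions: bins are XORed with key stream bits, and an IPM is replaced by another entry of its vector V through a key-dependent index shift, IPM_encr = V[(j + X_i) mod N]. Here V is assumed to be the set of modes that share the scan order of the original mode (vertical for modes 6-14, horizontal for modes 22-30, diagonal otherwise), which is one way to keep the scanning order unchanged; the exact construction of V in the proposed system may differ, and all names are hypothetical.

```python
VERTICAL_SCAN = list(range(6, 15))      # modes 6-14
HORIZONTAL_SCAN = list(range(22, 31))   # modes 22-30
DIAGONAL_SCAN = [m for m in range(35)
                 if m not in VERTICAL_SCAN and m not in HORIZONTAL_SCAN]


def scan_group(mode):
    """Return the vector V of modes that share the scan order of `mode`."""
    for group in (VERTICAL_SCAN, HORIZONTAL_SCAN, DIAGONAL_SCAN):
        if mode in group:
            return group
    raise ValueError("IPM must be in 0..34")


def encrypt_ipm(mode, key):
    """Shift the index of `mode` inside V by the key stream value `key`."""
    v = scan_group(mode)
    return v[(v.index(mode) + key) % len(v)]


def decrypt_ipm(enc_mode, key):
    """Inverse shift; the encrypted mode lies in the same V as the original."""
    v = scan_group(enc_mode)
    return v[(v.index(enc_mode) - key) % len(v)]


def xor_bins(bins, key_bits):
    """Stream-cipher step on a list of 0/1 bins; applying it twice restores the input."""
    return [b ^ k for b, k in zip(bins, key_bits)]


if __name__ == "__main__":
    enc = encrypt_ipm(10, key=7)
    print(enc, decrypt_ipm(enc, key=7))
```

Since the encrypted mode stays in the same scan group, a decoder without the key still selects the same coefficient scan and parses the bitstream correctly, while reconstructing a wrong prediction.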

ROI Encryption System Based on Tiles
The presented ROI encryption is based on the tile concept introduced in HEVC. This technique divides the video image into several rectangles, each with an integer number of blocks, where Intra prediction and entropy coding dependencies are broken at the tile borders. The proposal encrypts only the tiles containing the ROI, while the non-ROI tiles stay clear (not encrypted). A group of the most sensitive HEVC parameters, including MVs, MV signs, TCs, and TC signs, is then encrypted at the CABAC bin string level to decrease the visual quality of the ROI. This is done in a format-compliant manner and with a constant bitrate of the encrypted videos. In addition to these four parameters, we implemented the HEVC format-compliant encryption of IPMs, which may come with a slight increase in the bit rate.
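As a sketch of how ROI tiles could be selected, the example below builds a uniform tile grid aligned on CTU boundaries and returns the raster indices of the tiles that overlap a face bounding box. The CTU size of 64, the bounding-box format, and the function names are assumptions; the boundary rule is the usual uniform-spacing formula, in which the i-th boundary lies at floor(i × size_in_CTUs / number_of_tiles).

```python
def uniform_boundaries(size_in_ctus, n_tiles):
    """Tile boundaries in CTU units for a uniformly spaced tile grid."""
    return [(i * size_in_ctus) // n_tiles for i in range(n_tiles + 1)]


def roi_tiles(width, height, cols, rows, bbox, ctu=64):
    """Return raster-order indices of tiles overlapping bbox = (x0, y0, x1, y1).

    x1 and y1 are exclusive pixel coordinates.
    """
    w_ctus = -(-width // ctu)       # ceiling division: partial CTUs count
    h_ctus = -(-height // ctu)
    col_bd = [min(b * ctu, width) for b in uniform_boundaries(w_ctus, cols)]
    row_bd = [min(b * ctu, height) for b in uniform_boundaries(h_ctus, rows)]
    x0, y0, x1, y1 = bbox
    selected = []
    for r in range(rows):
        for c in range(cols):
            overlaps_x = col_bd[c] < x1 and x0 < col_bd[c + 1]
            overlaps_y = row_bd[r] < y1 and y0 < row_bd[r + 1]
            if overlaps_x and overlaps_y:
                selected.append(r * cols + c)
    return selected


if __name__ == "__main__":
    # 1920 x 1080 frame, four tile columns by three tile rows, hypothetical face box.
    print(roi_tiles(1920, 1080, 4, 3, (500, 100, 700, 400)))
```

Only the tiles returned by such a selection would be passed to the encryption stage; all other tiles are coded in the clear.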

Encryption Propagation in Inter Video Coding
The merge mode in HEVC deduces MVs from a list of spatial neighbouring and temporal candidates. Referring to these candidates can propagate encryption from the encrypted tiles to the background when the ROI is not correctly decrypted. Thus, we force the temporal candidates of the background tiles to be inside the background region in the reference frame. In order to prevent the propagation of encryption outside the ROI tile, two non-normative encoding constraints are used in the Kvazaar encoder (as shown in Figure 5):
1. The MVs of the predicted block are restricted to point only to the co-located tile of the reference frame.
2. The in-loop filters are disabled across the tile boundaries.
These two constraints tend to have a negative influence on RD performance, depending on the resolution, tiling configuration, and the video content. However, they ensure a safe interpolation process at the tile boundaries.
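The first constraint can be illustrated with a simple admissibility check that an encoder could apply to each candidate MV. This is a sketch under assumptions, not the Kvazaar implementation: MVs are taken in quarter-sample units, and a fractional MV component is assumed to need 3 extra reference samples before and 4 after the block in that direction, as for an 8-tap luma interpolation filter. The function name and the tuple layouts are hypothetical.

```python
def mv_stays_in_tile(mv_qpel, block, tile):
    """Check that the reference area of a block stays inside the co-located tile.

    mv_qpel: (mvx, mvy) in quarter-sample units.
    block:   (x, y, w, h) in luma samples.
    tile:    (x0, y0, x1, y1) in luma samples, x1 and y1 exclusive.
    """
    mvx, mvy = mv_qpel
    x, y, w, h = block
    x0, y0, x1, y1 = tile
    left = x + (mvx >> 2)           # integer part (floor, also for negative MVs)
    top = y + (mvy >> 2)
    right, bottom = left + w, top + h
    if mvx & 3:                     # fractional horizontal position: filter margin
        left -= 3
        right += 4
    if mvy & 3:                     # fractional vertical position: filter margin
        top -= 3
        bottom += 4
    return x0 <= left and right <= x1 and y0 <= top and bottom <= y1


if __name__ == "__main__":
    tile = (0, 0, 64, 64)
    block = (16, 16, 8, 8)
    print(mv_stays_in_tile((160, 0), block, tile))   # integer MV touching the tile edge
    print(mv_stays_in_tile((161, 0), block, tile))   # fractional MV needs samples outside
```

Rejecting candidates that fail such a check keeps every prediction, including interpolated ones, inside the tile, which is the property the text refers to as a safe interpolation process at the tile boundaries.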

[Figure 5: tiles, with in-loop filters disabled across tile boundaries]

Experiments
The proposed SE solution is implemented in the HEVC Test Model (HM) version 16.7 [29] and tested using All Intra (AI) and Random Access (RA) coding configurations. The ROI-based encryption and decryption algorithms are integrated in the Kvazaar HEVC real-time encoder and the OpenHEVC decoder, respectively. In this experiment, nine video sequences from different classes and categories were used. These videos, of 10 s duration each (except PeopleOnStreet, of 5 s), are taken from the HEVC common test conditions [16] and are listed in Table 4. They are jointly encoded and encrypted, in both Intra and Inter coding configurations, at four Quantization Parameters (QPs), where QP ∈ {22, 27, 32, 37}. The encrypted videos are encoded with two uniform tiling configurations: 4 × 3 (i.e., four horizontal by three vertical repartition) and 4 × 4. The same encoder configuration, without tiles and encryption, is used as a reference. The processor used in these experiments is a 32-bit 4-core Intel Core i5 running at 2.60 GHz with 16 GB of main memory. The operating system is the Ubuntu 14.04 Trusty Linux distribution. In the following, we detail the performance of the proposed encryption system. It is noteworthy that two HEVC platforms are used in this study (HM and Kvazaar/OpenHEVC). The first one is used for the selective encryption of the whole frame, which is performed with the HM encoder/decoder. Several objective measurements are applied: Peak Signal-to-Noise Ratio (PSNR), Structural SIMilarity (SSIM), and Bjøntegaard Delta Bit Rate (BD-BR) [30] evaluation of the IPM encryption. The second one is used for the ROI-based encryption, which is implemented using the Kvazaar encoder and the OpenHEVC decoder and assessed with further metrics: complexity evaluation and BD-BR, together with PSNR and SSIM.

Video Quality Metrics
PSNR and SSIM are applied to evaluate the quality of the encrypted videos. The quality of the video after encryption indicates how much visual content remains perceptible and, thus, how effective the encryption solution is. The results of these two metrics, using original and encrypted ROI solutions, are given in Tables 5 and 6. The average PSNR inside the ROI, for all encrypted sequences, remains below 11.2 dB and the SSIM values are below 0.22. These values strongly indicate that the video quality is severely degraded. In addition, the degradation holds across the tested QPs, that is, at different bit rates. Furthermore, the known plain-text attack is impracticable. In Table 7, we provide a comparison in terms of the PSNR and SSIM objective metrics between the proposed encryption solution and state-of-the-art encryption solutions. Our SE solution has a lower PSNR value than Reference [5] and lower SSIM values than those given in References [5,22]. Furthermore, we applied the PSNR and SSIM metrics, reported in Tables 8 and 9, to two different encryption levels, (TC, TC signs, MV, MV signs) and (TC, TC signs, MV, MV signs, IPMs), in AI and RA coding configurations. Combining both levels strengthens the encryption, and the results show the additional quality degradation that IPM encryption brings to the video sequences.
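For reference, the PSNR between an original and an encrypted (or decoded) region is computed from the mean squared error as PSNR = 10·log10(peak² / MSE). A minimal sketch for 8-bit samples given as flat sequences follows; the function name and the input layout are our own choices.

```python
import math


def psnr(reference, test, peak=255):
    """PSNR in dB between two equally sized sequences of sample values."""
    if len(reference) != len(test) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf             # identical content
    return 10.0 * math.log10(peak * peak / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    distorted = [110, 110, 110, 110]
    print(round(psnr(original, distorted), 2))
```

A low PSNR inside the ROI, such as the values below 11.2 dB reported above, means that the encrypted samples are far from the original ones.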

BD-BR Rate Evaluation
We consider the BD-BR metric [30], which measures the average difference between two bit-rate/PSNR curves. The encoding is carried out in both Inter and Intra coding configurations with 4 × 4 and 4 × 3 tile repartitions, taking into account the restrictions on MVs and the disabling of in-loop filters across tile edges. Tables 10 and 11 give the magnitude of the RD losses in Intra and Inter coding configurations for the two tile configurations. The bit rate overhead, which ranges from 2% to 18.23%, comes from the restrictions on the MVs and depends on the coding configuration (Inter or Intra), the video content, and the number of tiles within the frame.
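A BD-BR value can be computed along the following lines. This sketch assumes exactly four rate/PSNR points per curve (one per QP), fits the cubic through them with Lagrange interpolation of log-rate as a function of PSNR, and integrates over the common PSNR range with Simpson's rule, which is exact for cubic polynomials. The function names are hypothetical and no external library is required.

```python
import math


def _lagrange(xs, ys, x):
    """Evaluate at x the polynomial that interpolates the points (xs, ys)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total


def _integral(xs, ys, lo, hi):
    """Simpson's rule on [lo, hi]; exact for the cubic through four points."""
    mid = 0.5 * (lo + hi)
    return (hi - lo) / 6.0 * (_lagrange(xs, ys, lo)
                              + 4.0 * _lagrange(xs, ys, mid)
                              + _lagrange(xs, ys, hi))


def bd_rate(rate_ref, psnr_ref, rate_test, psnr_test):
    """Average bit-rate difference in percent of the test curve against the reference."""
    if not (len(rate_ref) == len(psnr_ref) == len(rate_test) == len(psnr_test) == 4):
        raise ValueError("this sketch expects four RD points per curve")
    log_ref = [math.log(r) for r in rate_ref]
    log_test = [math.log(r) for r in rate_test]
    lo = max(min(psnr_ref), min(psnr_test))     # common PSNR interval
    hi = min(max(psnr_ref), max(psnr_test))
    avg_diff = (_integral(psnr_test, log_test, lo, hi)
                - _integral(psnr_ref, log_ref, lo, hi)) / (hi - lo)
    return (math.exp(avg_diff) - 1.0) * 100.0


if __name__ == "__main__":
    rates = [1000.0, 2000.0, 4000.0, 8000.0]     # hypothetical kbit/s values
    psnrs = [32.0, 35.0, 38.0, 41.0]             # hypothetical dB values
    print(round(bd_rate(rates, psnrs, [r * 1.1 for r in rates], psnrs), 3))
```

A positive result means that the test configuration (for example, tiles with MV restrictions) needs more bit rate than the reference for the same PSNR.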
The BD-BR loss for the 4 × 4 tile repartition in Inter coding is larger than the loss in the Intra coding configuration, reaching up to 12.33% and 5.36%, respectively. The BD-BR loss for the 4 × 3 tile repartition is less than the loss for 4 × 4 tiles in both coding configurations. For example, the BD-BR loss of the Parkscene (1920 × 1080) video sequence with 4 × 3 and 4 × 4 tiles in the Inter coding configuration is around 8.55% and 9.81%, respectively. This variation in loss comes from the limitations related to tile coding: the disabling of in-loop filtering across tiles and the restrictions on MVs, which weigh more in the configuration with the higher number of tiles (4 × 4). With Intra coding configurations, the BD-BR loss remains minimal and does not exceed 4.13% and 5.16%, respectively. For example, the BD-BR loss for the PeopleOnStreet (2560 × 1600) video sequence is 5.01% and 3.11% in Inter coding and 3.55% and 2.04% in the Intra coding configuration. The coding restrictions slightly reduce the BD-BR efficiency, to an extent that depends on the video sequence content and resolution.
The IPM encryption causes a small bit-rate increase. The increase is larger in Inter coding configurations, even though Intra blocks are less frequent there than in the Intra configuration. Figure 6 shows the RD performance using the average bit-rate variation between two bit-rate/wPSNR curves (weighted PSNR calculated after correct decryption) for the BasketballDrive video sequence with and without encryption. As depicted in Figure 6, the IPM encryption leads to a minimal BD-BR loss.

Table 11. BD-rate and complexity of the proposed encryption system in Intra and Inter coding. Nine video sequences, encoded by Kvazaar (4 × 3 tile configuration), are used.

Encryption Quality
Encryption Quality (EQ) measures the difference between the number of occurrences of each gray level with and without encryption. The EQ and its maximum value are calculated using the following two equations, as given in [31] (see Appendix A):

$$EQ = \frac{\sum_{i=0}^{255} \left| o_i(C) - o_i(P) \right|}{256} \qquad (7)$$

where $o_i(C)$ is the observed number of occurrences of gray level $i$ in the encrypted frame $C$, and $o_i(P)$ is the observed number of occurrences of the same gray level $i$ in the plain frame $P$.

$$EQ_{max} = \frac{510 \times L \times W}{256^2} \qquad (8)$$

where $L$ and $W$ are the height and the width of the gray frame. The larger the EQ value, the better the encryption security is. The maximum EQ values of a given video frame of the Kimono1, PeopleOnStreet, and Vidyo1 sequences are equal to 16,136, 31,875, and 7171, respectively [32]. Table 12 provides the EQ values. The results indicate that the EQ values of our encryption solution for two video sequences (Kimono1 and PeopleOnStreet) are higher than those reported in [32]. This enhancement is brought by the IPM encryption, which is not examined in [32].
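The EQ measure can be computed from the gray-level histograms of the plain and encrypted frames. In the sketch below, the maximum is taken as 510·L·W/256², which corresponds to a plain frame with a single gray level and a perfectly uniform encrypted frame; this closed form is our reading and is checked only against the three maximum values quoted above (1920 × 1080, 2560 × 1600, and 1280 × 720 frames). Function names are hypothetical.

```python
from collections import Counter


def encryption_quality(plain, cipher, levels=256):
    """Average absolute difference between the gray-level histograms of two frames."""
    hist_p = Counter(plain)
    hist_c = Counter(cipher)
    return sum(abs(hist_c[i] - hist_p[i]) for i in range(levels)) / levels


def eq_max(height, width, levels=256):
    """Assumed upper bound: single-level plain frame against a uniform cipher frame."""
    return 2.0 * (levels - 1) * height * width / (levels * levels)


if __name__ == "__main__":
    print(int(eq_max(1080, 1920)), int(eq_max(1600, 2560)), int(eq_max(720, 1280)))
```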

Entropy Analysis
Information entropy measures the uncertainty of the symbol distribution in the video frame and is computed from the probability of occurrence of each symbol [33]. For a truly random frame with 8-bit samples, the entropy should be 8. Table 13 shows that the entropy of each encrypted block in encrypted video frame number 15, obtained with the proposed chaos-based SE scheme, is close to the theoretical value of 8. This indicates that the proposed scheme is secure and robust against the entropy attack.
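The entropy figure used here is the Shannon entropy of the sample values, H = −Σ p(s)·log2 p(s), expressed in bits per sample. A minimal sketch (the function name is ours):

```python
import math
from collections import Counter


def shannon_entropy(samples):
    """Shannon entropy in bits per sample of a non-empty sequence of symbols."""
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in Counter(samples).values())


if __name__ == "__main__":
    uniform_block = list(range(256)) * 4     # every 8-bit value equally likely
    print(shannon_entropy(uniform_block))
```

A block in which all 256 values are equally frequent reaches the theoretical maximum of 8 bits, whereas a constant block has an entropy of 0.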

Key Security
From the generated sequences, it is impossible to find the secret key; this is because of the structure of the chaotic generator which also includes a chaotic switching. The knowledge of part of the secret key is not very useful for an attacker because of the intrinsic property of the chaotic signal, which is extremely sensitive to the secret key. The size of the secret key, formed by all the initial conditions and by all the parameters of the system, varies from 299 bits, with delay = 1, to 555 bits, with delay = 3. This means that the brute force attack is impracticable.

Visual Analysis
Visual analysis is applied to assess how unrecognizable the videos are after encryption. An encrypted video is considered to have a high level of visual security if it is too distorted for its content to be recognized. We applied the Edge Differential Ratio (EDR), which evaluates the edge variations between the original and the encrypted images, with the RA encoding configuration [34], using the Laplacian of Gaussian method [35]. The proposed encryption method is effective when the edges of the original frame are no longer recognizable in the encrypted frame. The EDR is calculated as:

$$EDR = \frac{\sum_{i,j} \left| P_E(i,j) - C_E(i,j) \right|}{\sum_{i,j} \left| P_E(i,j) + C_E(i,j) \right|} \qquad (9)$$

where $P_E$ and $C_E$ are the edge-detected binary matrices of the plain and cipher frames, respectively. Figure 7 illustrates the visual impact of the proposed solution on the frame content. Figure 7b shows the distortion of the visual content of the frame. Edges in the encrypted frame (Figure 7d) are completely altered compared to edges in the original frame (Figure 7c).

The common step for identifying and tracking the ROI in the video is to split the HEVC frame into tiles where all ROI are included in ROI tiles and the background in separate non-ROI tiles [36]. In Figure 8, the tiles that include a human face represent the ROI tiles and the other tiles represent the background tiles. The proposed encryption solution performs a selective encryption of ROI tiles by encrypting the most sensitive HEVC syntax elements to decrease the visual quality of the ROI, as described in Sections 3 and 4. Based on this figure, we can observe that the proposed encryption solution decreases the quality of the ROI zone while the background remains clean, even in the inter coding configuration. Videos decoded and decrypted with the correct key are shown on the right side, while videos decoded without decryption are shown on the left side. The proposed real-time selective solution performs a secure protection of privacy in the HEVC video content with a small overhead in bit rate and coding complexity.
Traditional algorithms are more complex and require a longer execution time, which is not suitable for real-time applications such as live TV. The proposed system aims to gain a deep understanding of video data security in multimedia technologies and to provide security for real-time video applications using selective encryption for HEVC. Although suggested in a number of specific cases, selective encryption could be much more widely used in consumer electronic applications, ranging from mobile multimedia terminals to digital cameras. Furthermore, this solution can be used in free space optical communication applications. In Table 14, we compare our solution with other selective encryption solutions. Our algorithm encrypts the most sensitive HEVC parameters in a format-compliant manner.

Xu [37] | IPM, MVDs, T1s, signs of the NZ coefficients | yes | no | Chaos
Abomhara [38] | I frame | no | no | AES
Shahid [19] | T1s, NZ level | yes | no | AES
Fei [23] | IPM, MVD, Signs of residual | yes | yes | Chaos
Sung [39] | Motion vector | yes | yes | RC4
Wei [40] | NALUs | yes | yes | RC4
Wang [41] | IPM, MVD, Quantization coefficients | yes | yes | Hash and AES
Shuli [42] | IPM
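The EDR computation can be sketched as follows, assuming the common definition in which the sum of absolute differences between the two binary edge maps is divided by the sum of their element-wise sums. Edge detection itself (Laplacian of Gaussian in the text) is not shown; the inputs are assumed to be already binarized edge maps given as lists of rows, and the function name is hypothetical.

```python
def edge_differential_ratio(plain_edges, cipher_edges):
    """EDR between two binary edge maps of identical size (lists of rows of 0/1)."""
    num = 0
    den = 0
    for row_p, row_c in zip(plain_edges, cipher_edges):
        for p, c in zip(row_p, row_c):
            num += abs(p - c)
            den += abs(p + c)
    if den == 0:
        raise ValueError("both edge maps are empty of edges")
    return num / den


if __name__ == "__main__":
    plain = [[1, 0], [0, 1]]
    cipher = [[0, 1], [1, 0]]
    print(edge_differential_ratio(plain, cipher))
```

Values close to 1 mean that the edge structure of the original frame has been almost entirely displaced by the encryption, while 0 means that the edge maps are identical.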

Subjective Evaluations
The subjective experiment was performed in the psychovisual room of the IETR laboratory, in accordance with the ITU-R BT.500-13 Recommendation [43]. The video sequences were displayed on a 32-inch Full HD Samsung UN32J5003 screen. Fifteen observers, 10 men and 5 women, aged between 20 and 40 years, took part in the test. All the subjects were tested for color blindness and visual acuity with Ishihara and Snellen charts, respectively, and have a visual acuity of 10/10 in both eyes with or without correction, as described in [44]. We considered five video sequences from Table 4 (FourPeople, Kimono1, BasketballDrive, BQSquare, Cactus). The selective encryption and encoding are applied using the HM 16.7 encoder with RA encoding. The selective encryption is performed at two levels: (TC, TC signs, MV, MV signs) and All (TC, TC signs, MV, MV signs, IPM). Finally, these coding configurations yield 40 encrypted video sequences, with various QPs and resolutions.

Design and Procedure
The Double Stimulus Continuous Quality Scale (DSCQS) method [43] was applied in our subjective quality evaluation experiment. Each encrypted video was shown twice to the observer, along with its original version. The observers judged the degree of visibility of the content of the encrypted videos numerically. That is, each participant assigned a visibility score to each of the 40 test videos on a rating scale ranging from 1, meaning that the video content is Completely Invisible, to 5, meaning that the video content is Clearly Visible. After each test condition, a dedicated Graphical User Interface (GUI) was shown on the screen for about 9 s, during which the observer gave and then confirmed their judgement. The videos were shuffled so that two consecutive sequences always came from different configuration categories, in order to remove the observer's memory effects.

Data Processing
The first step in the analysis of the results is to calculate the Mean Opinion Score (MOS) for each video used in the experiment. This average is given by Equation (10):

$$MOS_{jk} = \frac{1}{N} \sum_{i=1}^{N} s_{ijk} \qquad (10)$$

where $s_{ijk}$ is the score of participant $i$ for degree of visibility $j$ of the sequence $k$ and $N$ is the number of observers. In order to better evaluate the reliability of the obtained results, it is advisable to associate a confidence interval, usually at 95%, with each MOS score. This is given by Equation (11):

$$IC_{jk} = 1.96 \, \frac{\sigma_{jk}}{\sqrt{N}}, \qquad \sigma_{jk} = \sqrt{\sum_{i=1}^{N} \frac{(s_{ijk} - MOS_{jk})^2}{N - 1}} \qquad (11)$$

Scores respecting the experiment conditions must be contained in the interval $[MOS_{jk} - IC_{jk}, MOS_{jk} + IC_{jk}]$.
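The MOS and its 95% confidence interval can be computed as in the following sketch, which assumes the usual form of the interval: 1.96 times the sample standard deviation (with N − 1 in the denominator) divided by the square root of the number of observers. The function name and the example scores are hypothetical.

```python
import math


def mos_with_ci(scores):
    """Return (MOS, half-width of the 95% confidence interval) for one test video."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed")
    mos = sum(scores) / n
    std = math.sqrt(sum((s - mos) ** 2 for s in scores) / (n - 1))
    return mos, 1.96 * std / math.sqrt(n)


if __name__ == "__main__":
    scores = [1, 1, 2, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1, 2]   # fifteen hypothetical observers
    mos, ci = mos_with_ci(scores)
    print(round(mos, 3), round(ci, 3))
```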

Subjective Scores
The subjective scores of all participants, collected through the dedicated GUI, have been used for the perceptual encryption measurement. Figure 9 illustrates the MOS for the two encryption configurations, for four video sequences coded with the HM software at QP 22 in the RA configuration. The subjects' scores generally range between barely visible and completely invisible for the first encryption scheme. This indicates that the visibility of the content is considerably decreased by our proposed SE solution. Indeed, the obtained results indicate that the content of the video is invisible. Furthermore, subjects attempted to guess the type of video context without being able to see any detail of the presented video. We noticed a small variation in MOS, depending on the video content and the QP used. The impact of IPM encryption on the content visibility is strong. Indeed, most observers' scores moved to Completely Invisible when the IPM encryption was added to the first encryption level (scheme). The subjects can barely see a few parts of the video (without being able to determine the general context of the shown video). The results depend strongly on the video class and content. BQSquare (Class D) and Cactus are completely invisible to all subjects, with an MOS of 1 and very few variations depending on the QP used. Moreover, BasketballDrive shows low visibility scores due to its strong motion. The scores of this video were dramatically reduced when we added the IPM encryption.

Statistical Analysis
Analysis of Variance (ANOVA) [45] was used for the statistical study. ANOVA allows us to examine whether the variance in visibility scores comes from the intended variation of the experimental variables (i.e., QP, Class, Encryption Scheme and Content) or is merely the result of chance. Table 15 shows that only the 'Encryption Scheme' parameter has a significant impact on the subjects' scores, with a P-value < 0.0001 (a factor is considered influential if its P-value is below 0.05).

Table 10 reports the complexity overhead of encryption for the 4 × 4 and 4 × 3 tile configurations on our 2.6 GHz Core i5 processor. For the 4 × 4 tile configuration, the average encoding time increases by 2.6% in Intra coding and 3.3% in Inter coding. The respective decoding times are 1.6% and 2.1% higher. Changing the tile configuration to 4 × 3 decreases the respective complexity overheads to 2.2% and 1.6% for encoding and 1.5% and 1.1% for decoding. These results confirm that the proposed SE solution can be performed without a noticeable complexity penalty. This is especially important in embedded and mobile devices that have restricted processing power, as can be seen in Figures 10 and 11.

Figure 11. Encoding time vs. complexity overhead for 9 video sequences.
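The overhead percentages quoted above can be read as relative time increases. The following sketch assumes the overhead is defined as (t_encrypted − t_reference) / t_reference; this definition, the function name and the timings are assumptions for illustration and are not measurements from Table 10.

```python
def overhead_percent(t_reference, t_encrypted):
    """Relative time increase, in percent, of the encrypted run.

    Assumption: overhead = (t_encrypted - t_reference) / t_reference.
    """
    if t_reference <= 0:
        raise ValueError("reference time must be positive")
    return 100.0 * (t_encrypted - t_reference) / t_reference


# Hypothetical timings in seconds: a 100.0 s reference encode
# that takes 102.6 s once selective encryption is enabled.
example = overhead_percent(100.0, 102.6)
```

Averaging such per-sequence values over the test set would yield a single figure per tile configuration and coding mode.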

Conclusions
This paper proposed a selective encryption solution that protects privacy by encrypting only the ROI in HEVC video content, together with selective encryption of all sensitive parts of the video. The ROI is isolated using the independent tile concept of HEVC. The ROI encryption is based on a chaos-based generator and is performed at the CABAC bin string level for the most sensitive HEVC parameters, including motion vectors, transform coefficients, and intra prediction modes. The format-compliant encryption of IPM, which introduces a slight bit rate increase, was also investigated. The encrypted bit stream can be decoded with a standard HEVC decoder, and a secret key is needed only for decryption. However, preventing the encryption from propagating outside the ROI incurs some bit rate overhead in the HEVC encoding process. The proposed encryption and decryption algorithms were integrated into the HM reference software to validate their conformance with the HEVC standard. In addition, their small impact on coding speed was verified in the real-time Kvazaar HEVC encoder and the OpenHEVC decoder. Objective rate-distortion-complexity experiments indicated that the proposed solution securely protects privacy in HEVC video content with little overhead in bit rate and coding complexity. It also prevents unexpected behaviour of the decoder.