A Face Image Virtualization Mechanism for Privacy Intrusion Prevention in Healthcare Video Surveillance Systems

: Closed-circuit television (CCTV) and video surveillance systems (VSSs) are becoming increasingly more common each year to help prevent incidents / accidents and ensure the security of public places and facilities. The increased presence of VSS is also increasing the number of per capita exposures to CCTV cameras. To help protect the privacy of the exposed objects, attention is being drawn to technologies that utilize intelligent video surveillance systems (IVSSs). IVSSs execute a wide range of surveillance duties—from simple identiﬁcation of objects in the recorded video data, to understanding and identifying the behavioral patterns of objects and the situations at the incident / accident scenes, as well as the processing of video information to protect the privacy of the recorded objects against leakage. Besides, the recorded privacy information is encrypted and recorded using blockchain technology to prevent forgery of the image. The technology herein proposed (the “proposed mechanism”) is implemented to a VSS, where the mechanism converts the original visual information recorded on a VSS into a similarly constructed image information, so that the original information can be protected against leakage. The face area extracted from the image information is recorded in a separate database, allowing the creation of a restored image that is in perfect symmetry with the original image for images with virtualized face areas. Speciﬁcally, the main section of this study proposes an image modiﬁcation mechanism that inserts a virtual face image that closely matches a predetermined similarity and uses a blockchain as the storage area.


Introduction
Closed-circuit television (CCTV) and video surveillance systems (VSSs), installed for incident/ accident prevention in public places or investigation, are becoming increasingly more common each year. With their numbers on the rise, controversy over the privacy violation of objects being recorded is also rising [1][2][3]. Accordingly, the public's interest in intelligent visual surveillance systems (IVSSs) is growing, as the system combines technologies that allow the understanding of the scenes of incidents/accidents and the behavioral patterns of the objects, to predict incidents/accidents and warn the VSS users, and prevent the leakage of personal information [4][5][6][7]. An IVSS thus records the persons in areas targeted for recording to help understand the locations and detect the faces of the persons so that they can be used as the basis for assessment and decision-making [8][9][10][11].
Detecting a person's face and using the image, later on, can compromise the personal information of the individual that was recorded in the original video data. Hence, an IVSS implements de-identification Symmetry 2020, 12, 891 2 of 15 to protect the privacy of the objects recorded on the system [12][13][14]. For de-identification, a certain portion(s) of the visual information is removed to prevent the guessing of the identity of a particular individual(s) or is replaced with other information to protect the personal information of the object(s). The most common cause of the implementation is masking, a type of de-identification, which inserts non-meaningful strings into the personal information-carrying portions/frames of the video and changes the identification information in the visual data [15][16][17]. The identification information contained in the altered visual data is safeguarded against privacy infiltration even if it is leaked to outside entities [18][19][20]. Security against personal information breaches in such images can be applied in a variety of fields, particularly in the medical environment, where they are often located and are continuously photographing in the same location; thus, the identification of patients is a valid technique to protect the privacy of patients [21][22][23][24]. However, statistical processing using the recorded visual data or legitimate requests for using the data does not allow the use of masked visual information. In the foregoing, an alternative is to use the original video data as the basis of generating virtual data, with the new information helping to prevent the leakage of the original visual data and allows for statistical processing using the visual information to take place [25][26][27]. The main part of this study follows a predetermined similarity to generate a virtual face image and inserts the image into recorded visual information so that it can identify public information with a virtualized face image. When reconstruction of the image is required, the original image and the reconstructed image can be perfectly symmetrical using the original face image recorded in the database. This paper is structured as follows: Section 2 examines the current status of techniques and studies related to the de-identification of visual data; Section 3 introduces the face image modification mechanism; and Section 4 presents the differences between the commercialized or previously introduced techniques and the proposed system.

Examination of the Existing Data Privacy Preservation Techniques
Intelligent video surveillance systems (IVSSs) implement various techniques to preserve the privacy of users [28][29][30][31]. The information presented in the following describe privacy protection methods that are typically used in IVSSs [32][33][34][35].

Blurring
Using weighted masks in the visual data, blurring multiplies the pixel values in the data by the weighted means/averages to obfuscate a certain portion(s) of the visual data. Typically, blurring is achieved by using the gaussian function to set the sigma parameter. Once the visual data are subject to blurring, they cannot be recovered to their original state [36][37][38]. This blurring technique blurs the edges, creating a clump of photos, and generally is applied to personal information about the subjects taken from the images to prevent the identification of targets by arbitrarily crushing them [39][40][41][42]. This effect becomes more blurred relative to the size of the mask applied, but as the size of the mask increases, the complexity of the calculation increases, requiring greater resource consumption. Figure 1 is a brief representation of the general logic of the blurring technique. Blurring makes a different value by applying a convention operation to a mask weighted against a given unit of zone.
Symmetry 2020, 12, x FOR PEER REVIEW 2 of 15 implements de-identification to protect the privacy of the objects recorded on the system [12][13][14]. For de-identification, a certain portion(s) of the visual information is removed to prevent the guessing of the identity of a particular individual(s) or is replaced with other information to protect the personal information of the object(s). The most common cause of the implementation is masking, a type of deidentification, which inserts non-meaningful strings into the personal information-carrying portions/frames of the video and changes the identification information in the visual data [15][16][17].
The identification information contained in the altered visual data is safeguarded against privacy infiltration even if it is leaked to outside entities [18][19][20]. Security against personal information breaches in such images can be applied in a variety of fields, particularly in the medical environment, where they are often located and are continuously photographing in the same location; thus, the identification of patients is a valid technique to protect the privacy of patients [21][22][23][24]. However, statistical processing using the recorded visual data or legitimate requests for using the data does not allow the use of masked visual information. In the foregoing, an alternative is to use the original video data as the basis of generating virtual data, with the new information helping to prevent the leakage of the original visual data and allows for statistical processing using the visual information to take place [25][26][27]. The main part of this study follows a predetermined similarity to generate a virtual face image and inserts the image into recorded visual information so that it can identify public information with a virtualized face image. When reconstruction of the image is required, the original image and the reconstructed image can be perfectly symmetrical using the original face image recorded in the database. This paper is structured as follows: Section 2 examines the current status of techniques and studies related to the de-identification of visual data; Section 3 introduces the face image modification mechanism; and Section 4 presents the differences between the commercialized or previously introduced techniques and the proposed system.

Examination of the Existing Data Privacy Preservation Techniques
Intelligent video surveillance systems (IVSSs) implement various techniques to preserve the privacy of users [28][29][30][31]. The information presented in the following describe privacy protection methods that are typically used in IVSSs [32][33][34][35].

Blurring
Using weighted masks in the visual data, blurring multiplies the pixel values in the data by the weighted means/averages to obfuscate a certain portion(s) of the visual data. Typically, blurring is achieved by using the gaussian function to set the sigma parameter. Once the visual data are subject to blurring, they cannot be recovered to their original state [36][37][38]. This blurring technique blurs the edges, creating a clump of photos, and generally is applied to personal information about the subjects taken from the images to prevent the identification of targets by arbitrarily crushing them [39][40][41][42]. This effect becomes more blurred relative to the size of the mask applied, but as the size of the mask increases, the complexity of the calculation increases, requiring greater resource consumption. Figure  1 is a brief representation of the general logic of the blurring technique. Blurring makes a different value by applying a convention operation to a mask weighted against a given unit of zone.

Mosaic Masking
With mosaic masking, the pixel values of a certain region(s) in the video data are summed and then averaged. This average is then entered into the same region(s) unilaterally so that the original visual information in the region(s) cannot be recognized by third parties. Image mosaic applies the average parameter to the entirety of the region(s), making it difficult to recover the original images [43][44][45][46]. Such mosaic techniques can be used as a means to mask the personal information existing in images; in turn, the restoration of these techniques is currently being developed by the development of artificial intelligence [47][48][49][50]. One notable example is Google's artificial intelligence-applied mosaic-replicating technique, in which photos with mosaics are restored to the point where it is difficult to identify with a third party. This methodology suggests that while accurate restoration may be difficult, it may be possible to leak personal information in that it can be added. Figure 2 is a brief description of the mosaic technique and shows that a certain area is synthesized by averaging the average value as one area.

Mosaic Masking
With mosaic masking, the pixel values of a certain region(s) in the video data are summed and then averaged. This average is then entered into the same region(s) unilaterally so that the original visual information in the region(s) cannot be recognized by third parties. Image mosaic applies the average parameter to the entirety of the region(s), making it difficult to recover the original images [43][44][45][46]. Such mosaic techniques can be used as a means to mask the personal information existing in images; in turn, the restoration of these techniques is currently being developed by the development of artificial intelligence [47][48][49][50]. One notable example is Google's artificial intelligence-applied mosaic-replicating technique, in which photos with mosaics are restored to the point where it is difficult to identify with a third party. This methodology suggests that while accurate restoration may be difficult, it may be possible to leak personal information in that it can be added. Figure 2 is a brief description of the mosaic technique and shows that a certain area is synthesized by averaging the average value as one area.

Removal and Transformation
Removal and transformation either completely remove or transform the pixels containing the identification information of the object(s) recorded in the visual data. The technique intentionally compromises the original data. Such arbitrary removal or modification of the part where the personal information of the image exists will make it impossible to restore it to the original without additional measures to restore it [51][52][53][54][55]. To restore the image removed from within the image, the removed image must be kept separately and the images stored separately must be identified as those of the original image [56][57][58][59]. However, the harder the video restoration process becomes, the more computing resources are required and the more maintenance costs are required. Figure 3 illustrates the application principle of the Removal and Transformation technique, which shows the process of removing or synthesizing a certain pixel region in the image, removing the existing value and changing it to a completely new value.

Encryption
Encryption implements encryption keys to visual data to convert the visual information in the recorded images into cryptograms. Any authorized users with decryption keys to the coded texts can readily obtain the original visual information. Hence, additional safeguarding technologies are required to prevent decryption by unauthorized third parties [60][61][62]. Figure 4 shows the process of

Removal and Transformation
Removal and transformation either completely remove or transform the pixels containing the identification information of the object(s) recorded in the visual data. The technique intentionally compromises the original data. Such arbitrary removal or modification of the part where the personal information of the image exists will make it impossible to restore it to the original without additional measures to restore it [51][52][53][54][55]. To restore the image removed from within the image, the removed image must be kept separately and the images stored separately must be identified as those of the original image [56][57][58][59]. However, the harder the video restoration process becomes, the more computing resources are required and the more maintenance costs are required. Figure 3 illustrates the application principle of the Removal and Transformation technique, which shows the process of removing or synthesizing a certain pixel region in the image, removing the existing value and changing it to a completely new value.

Mosaic Masking
With mosaic masking, the pixel values of a certain region(s) in the video data are summed and then averaged. This average is then entered into the same region(s) unilaterally so that the original visual information in the region(s) cannot be recognized by third parties. Image mosaic applies the average parameter to the entirety of the region(s), making it difficult to recover the original images [43][44][45][46]. Such mosaic techniques can be used as a means to mask the personal information existing in images; in turn, the restoration of these techniques is currently being developed by the development of artificial intelligence [47][48][49][50]. One notable example is Google's artificial intelligence-applied mosaic-replicating technique, in which photos with mosaics are restored to the point where it is difficult to identify with a third party. This methodology suggests that while accurate restoration may be difficult, it may be possible to leak personal information in that it can be added. Figure 2 is a brief description of the mosaic technique and shows that a certain area is synthesized by averaging the average value as one area.

Removal and Transformation
Removal and transformation either completely remove or transform the pixels containing the identification information of the object(s) recorded in the visual data. The technique intentionally compromises the original data. Such arbitrary removal or modification of the part where the personal information of the image exists will make it impossible to restore it to the original without additional measures to restore it [51][52][53][54][55]. To restore the image removed from within the image, the removed image must be kept separately and the images stored separately must be identified as those of the original image [56][57][58][59]. However, the harder the video restoration process becomes, the more computing resources are required and the more maintenance costs are required. Figure 3 illustrates the application principle of the Removal and Transformation technique, which shows the process of removing or synthesizing a certain pixel region in the image, removing the existing value and changing it to a completely new value.

Encryption
Encryption implements encryption keys to visual data to convert the visual information in the recorded images into cryptograms. Any authorized users with decryption keys to the coded texts can readily obtain the original visual information. Hence, additional safeguarding technologies are required to prevent decryption by unauthorized third parties [60][61][62]. Figure 4 shows the process of

Encryption
Encryption implements encryption keys to visual data to convert the visual information in the recorded images into cryptograms. Any authorized users with decryption keys to the coded texts can readily obtain the original visual information. Hence, additional safeguarding technologies are required to prevent decryption by unauthorized third parties [60][61][62]. Figure 4 shows the process creating a cryptographic statement that can be restored but the original cannot be recognized by applying cryptographic keys to pixel values present in the image.

Related Research
Through the de-identification of images, this paper prevented leakage by third parties and further selected and synthesized virtualized face images in the images, making it difficult to identify the objects of images, while enabling the acquisition of general data, enabling digitized statistics. Thus, the main objective of the proposed mechanism was summarized and summarized for other relevant studies applying non-identification.
Uddin pointed out that recent video surveillance systems have increased significantly, generating large amounts of video data, and at the same time increasing the need for distributed processing of video data. Most of the existing solutions do not support more complex application scenarios while using the existing Client/Server framework to perform face and object recognition. It also pointed out that these frameworks were rarely processed in an extensible manner using distributed computing and that existing works do not support low-level distributed video processing APIs (Application Programming Interfaces). To address this problem, a distributed video analysis framework for intelligent video surveillance, called SIAT, was proposed, and SIAT introduced stateof-the-art distributed computing technologies to handle real-time video streams and batch video analytics to ensure scalability, efficiency and fault tolerance [63].
Brkić proposed a computer vision-based non-identification pipeline to help protect human privacy in video sequins by maintaining the naturalness and usefulness of the unidentified data and blurring faces. The proposed pipeline can be recognized when applying simple techniques, such as fade, by de-identifying features such as clothing, hair and skin color. Assuming a surveillance scenario, we combined background subtraction based on Gaussian blend with an improved version of the GrabCut algorithm to find and classify pedestrians and to change the appearance of segment pedestrians through neural technology algorithms that render pedestrian images in different styles using responses from deep neural networks [64].
Gros pointed out that the development of camera and computing equipment hardware has made extensive video data capture and storage simpler, and provides ample opportunity to share video sequins, pointing out the need for automated ways of de-identifying images, especially facial areas, to protect the privacy of subjects shown in the video. To solve this problem, a wide range of experiments have demonstrated that pixelation and blurring are very poor in privacy protection, while greatly distorting data; the Google Algorithm has proposed a new technique by combining model-based face image parameterization with official privacy models [65].

Proposed Mechanism for Changing Face Information to Prevent Privacy Infiltration
The mechanism proposed in this section of the study (hereinafter, the "proposed mechanism") substitutes a face image(s) contained in the recorded visual data with a virtual face image(s) stored in the existing database (DB). Hence, the foregoing is named "virtualization", and the face image information stored in the DB is abbreviated as "virtual face information" for this study. The proposed mechanism 1) detects face information contained in the recorded visual data; 2) generates virtual face information by using virtualization random numbers; and 3) then replaces the face information in the visual data with the virtual face information with comparable similarities to only allow the

Related Research
Through the de-identification of images, this paper prevented leakage by third parties and further selected and synthesized virtualized face images in the images, making it difficult to identify the objects of images, while enabling the acquisition of general data, enabling digitized statistics. Thus, the main objective of the proposed mechanism was summarized and summarized for other relevant studies applying non-identification.
Uddin pointed out that recent video surveillance systems have increased significantly, generating large amounts of video data, and at the same time increasing the need for distributed processing of video data. Most of the existing solutions do not support more complex application scenarios while using the existing Client/Server framework to perform face and object recognition. It also pointed out that these frameworks were rarely processed in an extensible manner using distributed computing and that existing works do not support low-level distributed video processing APIs (Application Programming Interfaces). To address this problem, a distributed video analysis framework for intelligent video surveillance, called SIAT, was proposed, and SIAT introduced state-of-the-art distributed computing technologies to handle real-time video streams and batch video analytics to ensure scalability, efficiency and fault tolerance [63].
Brkić proposed a computer vision-based non-identification pipeline to help protect human privacy in video sequins by maintaining the naturalness and usefulness of the unidentified data and blurring faces. The proposed pipeline can be recognized when applying simple techniques, such as fade, by de-identifying features such as clothing, hair and skin color. Assuming a surveillance scenario, we combined background subtraction based on Gaussian blend with an improved version of the GrabCut algorithm to find and classify pedestrians and to change the appearance of segment pedestrians through neural technology algorithms that render pedestrian images in different styles using responses from deep neural networks [64].
Gros pointed out that the development of camera and computing equipment hardware has made extensive video data capture and storage simpler, and provides ample opportunity to share video sequins, pointing out the need for automated ways of de-identifying images, especially facial areas, to protect the privacy of subjects shown in the video. To solve this problem, a wide range of experiments have demonstrated that pixelation and blurring are very poor in privacy protection, while greatly distorting data; the Google Algorithm has proposed a new technique by combining model-based face image parameterization with official privacy models [65].

Proposed Mechanism for Changing Face Information to Prevent Privacy Infiltration
The mechanism proposed in this section of the study (hereinafter, the "proposed mechanism") substitutes a face image(s) contained in the recorded visual data with a virtual face image(s) stored in the existing database (DB). Hence, the foregoing is named "virtualization", and the face image information stored in the DB is abbreviated as "virtual face information" for this study. The proposed mechanism 1) detects face information contained in the recorded visual data; 2) generates virtual Symmetry 2020, 12, 891 5 of 15 face information by using virtualization random numbers; and 3) then replaces the face information in the visual data with the virtual face information with comparable similarities to only allow the identification of sex, age, race and such other public information. Such published information may then be used in statistics of the population that has moved to that location. Virtual face image data can later be recovered using the original data by authorized users. The proposed mechanism comprises a face region detection module, a virtual face features generation module, a virtual face feature vector generation module, a virtual face image data generation module, a face features recovery module, a recovered face feature vector generation module and a face image data recovery module. Figure 5 lists the sequences in which the mechanism is implemented. Table 1 shows the abbreviations used in the proposed mechanism.
Symmetry 2020, 12, x FOR PEER REVIEW 5 of 15 identification of sex, age, race and such other public information. Such published information may then be used in statistics of the population that has moved to that location. Virtual face image data can later be recovered using the original data by authorized users. The proposed mechanism comprises a face region detection module, a virtual face features generation module, a virtual face feature vector generation module, a virtual face image data generation module, a face features recovery module, a recovered face feature vector generation module and a face image data recovery module. Figure 5 lists the sequences in which the mechanism is implemented. Table 1 shows the abbreviations used in the proposed mechanism.

Face Region Detection Module
This module detects a face region(s) found in the video data recorded by more than one camera. In response to the sequences in which more than one set of visual data are recorded, the module allocates an order or sequence of recording image data sets (more than one) according to the cameras used. In the case of no face regions detected in the visual data, the module may not allocate the recording order or sequence.

Virtual Face Features Generation Module
This module generates virtualization of random numbers for each image data set (more than one), which corresponds to the order/sequence of recording allocated by the face region detection module, and then uses the random numbers to convert the face features (parameters) that indicate the feature coordinates capable of identifying the person with the face information extracted from the face region into the virtual face feature parameters, by adding noise values based on the virtualization random numbers. As the initial values for the random number generation function, the virtual face features generation module uses pre-determined seed values.
Formula (1) is an expression of virtualization random number generation function that aims at generating virtual faces. "P" refers to the random number generation function; "i" is the order/sequence of issuing the random number generation circuit; and "S" is the pre-defined seed value. Using the formula above to generate virtualization random numbers according to the generation orders/sequences, one can use the random number generation function, where the pre-determined seed value and virtualization random number generation range are the same, to generate the random numbers in the generation sequences. This results in the same virtualization random numbers.
If an example is shown for Formula (1), it is assumed that the random number generation range is in the range of 0 to 100, and the generation cycle is two times. It is also assumed that the first and second rounds have an "S" of 0.2 and 0.5, respectively. In the mechanism, the first round randomly generates random numbers in the range of 0 to 100, and multiplies the number by a fixed seed so that random numbers can be generated within the range intended by the user. Accordingly, the random number generated in the first round will be generated within the range desired by the user by multiplying the fixed seed value by 0.2, and the range will be limited to 0 to 20. Thus, in the second round, it will range from 0 to 50.
The virtualization random numbers so obtained will be used to generate noise values by adding the differences between the median values of the pre-defined virtualization random number generation ranges, to the pre-determined similarity values.
Formula (2) calculates the noise values for generating virtual face features (parameters). Here, "N(i)" refers to the noise value for generating a virtual face; "T" the pre-defined similarity value; "P(i)S" the virtualization random numbers; and "R gr " the pre-set virtualization random number generation range. T = 50%, Rgr = 50 Formula (3) with a 50% similarity and a 50 virtualization random number generation range is an example of how to generate the ranges. Calculating noise values to the three recordings will result in three virtualization random numbers generated within the range of 50, according to Formula (1). In this section, Formula (4) is adopted to arbitrarily set the random numbers at 15, 40 and 25.
Formulas (5)-(7) show the process in which noise values are calculated according to the order/sequence of video recording. As shown, the noise values are in inverse proportion to the pre-defined similarity values-with an increased similarity value resulting in a closer rendition of the original data. In reverse, a lower similarity value will lead to the opposite rendition of the original.

Virtual Face Feature Vector and Data Generation Module
This module extracts virtual face feature vectors from the virtual face feature parameters, i.e., two-dimensional array-based coordinates. The virtual face image data generation module searches the virtual face information based on the virtual face feature vectors that are obtained from the vectorized virtual face feature parameters converted by the virtual face features generation module, as well as with the use of principal component analysis (PCA), which analyzes the inter-vector similarities using a covariance matrix in the face information DB; linear discriminant analysis (LDA), which maximizes the proportions of between-class distributions and within-class distributions; and such other techniques. Then the virtual face image data generation module inserts the virtual face information (search result) to substitute the face region in the video data and generate the virtual face image data. Even if the inserted virtual face information has the same level of similarity as the original face information, each recording may insert a different version of virtual face information. In this process, the real face data extracted from the original image is encrypted and recorded ona blockchain, and by implementing a blockchain in a private environment, it minimizes external access and prevents forgery. The images to be synthesized generate an identification key for the image and include the identification key of the image for the additional extracted face area so that if the restoration of the image is required, the images matched using the identifier for the image can be applied and restored.

Face Features Recovery Module
This module adds the recovery random number SDs to the pre-determined similarity values to calculate back the noise values that were used to generate the virtual face feature parameters. Then model deducts the counter-calculated noise values to generate the recovered face feature parameters that were restored from the face feature parameters. The pre-defined seed values are used to calculate the initial values for the recovery random number generation function.
Formula (8) is an expression for the recovery random number generation function, which is to recover the virtual face. Here, "P" refers to the random number generation function; "i" is the issue order/sequence of the random number generation function; and "S" the pre-defined seed values. This formula can calculate the noise values by adding the similarity values that were used during the generation of virtual image data, to the recovery random number SDs. Formula (8) takes advantage of the value of P(i) in the random number function P(i)S, which is carried out for face virtualization. This allows the restore the random number to have the same value as the random number generated in Formula (1). The formula can also generate the recovered face feature parameters by adding the virtual face feature parameters to the noise values.
Symmetry 2020, 12, 891 8 of 15 Formula (9) is a noise calculation expression that aims to counter-calculate the virtual face information. "N'(i)" refers to the noise value for recovering virtual faces, and "T" is the pre-determined similarity value. "P'(i)S" indicates the recovery random number, and "R gr " the pre-set virtualization random number generation range.

Face Image Restore Module
This module vectorizes the recovered face feature parameters that were generated from the face features recovery module, and uses the vectorized values to generate the recovered face feature vectors. After this, based on the recovered face feature vectors (i.e., a result of the vectorization of recovered face feature parameters), this module searches the recovered face information in the face information DB, and then inserts the recovered face information (search result) into the face regions in the virtual face image data to generate the recovered image data. The information necessary for the recovery will be offered only to the authorized administrators, hence the recovery by unauthorized third parties is prevented.

Comparison and Analysis of the Proposed Mechanism and the Existing Techniques
The proposed mechanism detects one or more pieces of face information contained in the recorded image data and replaces the face information with virtual face information so that the objects recorded in the video cannot be specified. The face information of the object(s) recorded is stored in the face information DB. When an authorized user requests access to the information, later on, the image information that has been replaced with virtual face information can be recovered to its original state and sent to the requesting user. The proposed mechanism (face image modification) was compared with four other techniques currently in use, namely, blurring, image mosaic, removal and transformation, and encryption: Blurring is a technique that applies pixel values of the visual data to weighted masks to obfuscate the image information. The technique allows recovery via deblurring but not the identification of the public information regarding the targeted object (2). Figure 6 lists the original image, blurring-applied image, and recovered image. Formula (9) is a noise calculation expression that aims to counter-calculate the virtual face information. "N'(i)" refers to the noise value for recovering virtual faces, and "T" is the predetermined similarity value. "P'(i)S" indicates the recovery random number, and "Rgr" the pre-set virtualization random number generation range.

Face Image Restore Module
This module vectorizes the recovered face feature parameters that were generated from the face features recovery module, and uses the vectorized values to generate the recovered face feature vectors. After this, based on the recovered face feature vectors (i.e., a result of the vectorization of recovered face feature parameters), this module searches the recovered face information in the face information DB, and then inserts the recovered face information (search result) into the face regions in the virtual face image data to generate the recovered image data. The information necessary for the recovery will be offered only to the authorized administrators, hence the recovery by unauthorized third parties is prevented.

Comparison and Analysis of the Proposed Mechanism and the Existing Techniques
The proposed mechanism detects one or more pieces of face information contained in the recorded image data and replaces the face information with virtual face information so that the objects recorded in the video cannot be specified. The face information of the object(s) recorded is stored in the face information DB. When an authorized user requests access to the information, later on, the image information that has been replaced with virtual face information can be recovered to its original state and sent to the requesting user. The proposed mechanism (face image modification) was compared with four other techniques currently in use, namely, blurring, image mosaic, removal and transformation, and encryption: Blurring is a technique that applies pixel values of the visual data to weighted masks to obfuscate the image information. The technique allows recovery via deblurring but not the identification of the public information regarding the targeted object (2). Figure 6 lists the original image, blurringapplied image, and recovered image. Image mosaic sums the pixel values in a particular region(s) in the video data that includes face information upon which de-identification is requested. The technique then obtains the average/mean from the sum and applies it unilaterally to all pixels in the region(s) so the target object(s) cannot be recognized. In the foregoing, the public information of the target(s) is unrecognizable; however, it is being challenged by, for example, a recent Google publication reported on the development and use of an artificial intelligence (AI)-enabled technology to guess images similar to those in the original data and recover them. Figure 7 shows images of mosaic techniques and restored images. In general, it is impossible to restore the mosaic technique if it has been seriously changed, but recently, the restoration technique using artificial intelligence has been unveiled, showing the possibility of restoration, although it is not perfect. Image mosaic sums the pixel values in a particular region(s) in the video data that includes face information upon which de-identification is requested. The technique then obtains the average/mean from the sum and applies it unilaterally to all pixels in the region(s) so the target object(s) cannot be recognized. In the foregoing, the public information of the target(s) is unrecognizable; however, it is being challenged by, for example, a recent Google publication reported on the development and use of an artificial intelligence (AI)-enabled technology to guess images similar to those in the original data and recover them. Figure 7 shows images of mosaic techniques and restored images. In general, it is impossible to restore the mosaic technique if it has been seriously changed, but recently, the restoration technique using artificial intelligence has been unveiled, showing the possibility of restoration, although it is not perfect. Removal and transformation refers to a method that renders the face information unrecognizable by removing or altering the pixel values for a face region(s) contained in the visual data. With the alteration of the visual data, identifying the public information regarding the target(s) is infeasible. Recovering the data is also difficult due to the deletion of the data contained in the visual information. Figure 8 shows the images prior to the application of Removal and Transformation and the images applied. Although it is possible to change the identification information to a completely different image by composing it with another image, the problem is that restoration is impossible because the image value is completely changed. Lastly, encryption uses encryption keys on visual data to convert them into ciphertexts. The original data are offered only after authorized users decrypt the visual data. Depending on the type of encryption, techniques vary-from converting the original information and obfuscating it for preventing identification to modifying visual data units/frames to render visual data identification itself infeasible. These techniques, however, allow recovery, in which case the public information regarding the objects can be recognized. Access to the recovered information or to the decryption keys is granted only to authorized users. Figure 9 shows the encryption process for images. (A) shows the original image, (B) shows the result of encrypting the image, and (C) is the restored image. In the case of such encryption techniques, the original is unidentifiable so as not to be leaked to the outside world, and images can be restored by legitimate users. The process of conducting a similarity test on subjects will be carried out through a confrontation with virtual facial information stored in the DB for virtualization. At this time, each virtual face has a vector value for the face, and in the proposed mechanism, it is replaced by a virtual face with the Removal and transformation refers to a method that renders the face information unrecognizable by removing or altering the pixel values for a face region(s) contained in the visual data. With the alteration of the visual data, identifying the public information regarding the target(s) is infeasible. Recovering the data is also difficult due to the deletion of the data contained in the visual information. Figure 8 shows the images prior to the application of Removal and Transformation and the images applied. Although it is possible to change the identification information to a completely different image by composing it with another image, the problem is that restoration is impossible because the image value is completely changed. Removal and transformation refers to a method that renders the face information unrecognizable by removing or altering the pixel values for a face region(s) contained in the visual data. With the alteration of the visual data, identifying the public information regarding the target(s) is infeasible. Recovering the data is also difficult due to the deletion of the data contained in the visual information. Figure 8 shows the images prior to the application of Removal and Transformation and the images applied. Although it is possible to change the identification information to a completely different image by composing it with another image, the problem is that restoration is impossible because the image value is completely changed. Lastly, encryption uses encryption keys on visual data to convert them into ciphertexts. The original data are offered only after authorized users decrypt the visual data. Depending on the type of encryption, techniques vary-from converting the original information and obfuscating it for preventing identification to modifying visual data units/frames to render visual data identification itself infeasible. These techniques, however, allow recovery, in which case the public information regarding the objects can be recognized. Access to the recovered information or to the decryption keys is granted only to authorized users. Figure 9 shows the encryption process for images. (A) shows the original image, (B) shows the result of encrypting the image, and (C) is the restored image. In the case of such encryption techniques, the original is unidentifiable so as not to be leaked to the outside world, and images can be restored by legitimate users. The process of conducting a similarity test on subjects will be carried out through a confrontation with virtual facial information stored in the DB for virtualization. At this time, each virtual face has a vector value for the face, and in the proposed mechanism, it is replaced by a virtual face with the Lastly, encryption uses encryption keys on visual data to convert them into ciphertexts. The original data are offered only after authorized users decrypt the visual data. Depending on the type of encryption, techniques vary-from converting the original information and obfuscating it for preventing identification to modifying visual data units/frames to render visual data identification itself infeasible. These techniques, however, allow recovery, in which case the public information regarding the objects can be recognized. Access to the recovered information or to the decryption keys is granted only to authorized users. Figure 9 shows the encryption process for images. (A) shows the original image, (B) shows the result of encrypting the image, and (C) is the restored image. In the case of such encryption techniques, the original is unidentifiable so as not to be leaked to the outside world, and images can be restored by legitimate users. Removal and transformation refers to a method that renders the face information unrecognizable by removing or altering the pixel values for a face region(s) contained in the visual data. With the alteration of the visual data, identifying the public information regarding the target(s) is infeasible. Recovering the data is also difficult due to the deletion of the data contained in the visual information. Figure 8 shows the images prior to the application of Removal and Transformation and the images applied. Although it is possible to change the identification information to a completely different image by composing it with another image, the problem is that restoration is impossible because the image value is completely changed. Lastly, encryption uses encryption keys on visual data to convert them into ciphertexts. The original data are offered only after authorized users decrypt the visual data. Depending on the type of encryption, techniques vary-from converting the original information and obfuscating it for preventing identification to modifying visual data units/frames to render visual data identification itself infeasible. These techniques, however, allow recovery, in which case the public information regarding the objects can be recognized. Access to the recovered information or to the decryption keys is granted only to authorized users. Figure 9 shows the encryption process for images. (A) shows the original image, (B) shows the result of encrypting the image, and (C) is the restored image. In the case of such encryption techniques, the original is unidentifiable so as not to be leaked to the outside world, and images can be restored by legitimate users. The process of conducting a similarity test on subjects will be carried out through a confrontation with virtual facial information stored in the DB for virtualization. At this time, each virtual face has a vector value for the face, and in the proposed mechanism, it is replaced by a virtual face with the The process of conducting a similarity test on subjects will be carried out through a confrontation with virtual facial information stored in the DB for virtualization. At this time, each virtual face has a vector value for the face, and in the proposed mechanism, it is replaced by a virtual face with the most similar value by comparing the generated noise value with the vector value based on the extracted face. Figure 10 shows the vector value extracted through the face, the resulting random value, the generated random value and the name of the face with the most similar figure.
Symmetry 2020, 12, x FOR PEER REVIEW 10 of 15 most similar value by comparing the generated noise value with the vector value based on the extracted face. Figure 10 shows the vector value extracted through the face, the resulting random value, the generated random value and the name of the face with the most similar figure.   The proposed mechanism substitutes the face information contained in the visual data with virtual faces. Hence, the approach allows the identification of visual data as well as the distinction of the basic public information regarding objects. An authorized user can have the virtual face image data recovered to its original state. The face information of the objects is stored in the face information DB, which offers the merit of stronger security in that the original face information is difficult to be identified from the converted virtual face image information even when both the face information DB and the virtual face image data are leaked. Figure 12 is computed as shown in Figure 11 in the virtual face list, replacing the face with the original face with the most similar noise value. As such, replacing a face image with a value that is   Figure 10 shows the vector value extracted through the face, the resulting random value, the generated random value and the name of the face with the most similar figure.   The proposed mechanism substitutes the face information contained in the visual data with virtual faces. Hence, the approach allows the identification of visual data as well as the distinction of the basic public information regarding objects. An authorized user can have the virtual face image data recovered to its original state. The face information of the objects is stored in the face information DB, which offers the merit of stronger security in that the original face information is difficult to be identified from the converted virtual face image information even when both the face information DB and the virtual face image data are leaked. Figure 12 is computed as shown in Figure 11 in the virtual face list, replacing the face with the original face with the most similar noise value. As such, replacing a face image with a value that is The proposed mechanism substitutes the face information contained in the visual data with virtual faces. Hence, the approach allows the identification of visual data as well as the distinction of the basic public information regarding objects. An authorized user can have the virtual face image data recovered to its original state. The face information of the objects is stored in the face information DB, which offers the merit of stronger security in that the original face information is difficult to be identified from the converted virtual face image information even when both the face information DB and the virtual face image data are leaked. Figure 12 is computed as shown in Figure 11 in the virtual face list, replacing the face with the original face with the most similar noise value. As such, replacing a face image with a value that is most similar to the noise value extracted on a vector basis can prevent a subject from leaking into the face image and provide basic information for identifying gender, etc.   The experiment was conducted using the following CPU: Intel i3-6100 3.70GHz; random access memory (RAM): 4 GB environment. To verify the effectiveness of the mechanism proposed in the research process, the procedures required in the face virtualization phase and the time required for image virtualization were analyzed.
First, the procedures of face virtualization required four stages: face area detection, vector operation in the detection area, face vector-based image selection, image composition, face area detection, vector operation in the detection area, virtualization vector-based image extraction and image reconstruction, which required a total of eight steps. While traditional image privacy techniques proceed in three stages-face extraction, image processing algorithm operation and the resultant image production-it can be found that relatively many processes are required. The more processing required by the required items compared to the general images require more processes by adding face virtualization steps for the de-identification of the images in the existing procedures of face extraction, image processing and encrypted image generation. Secondly, we analyzed the time required to virtualize images, and in this process, only one person who was able to detect the face area was using the mp4 extension for the experiment. For the equality of time measurements, the process measured the time for frame extraction in images, face area detection, face virtualization and image production using only extracted frames, which typically took three times the time of the images taken. The images used for the experiment were 8 s long and took about 20 s to virtualize the face within the image and produce it. Time beyond the actual image was difficult to graft on the site, but it was conducted after a bitmap reinterpretation of the extracted images in the experimental environment, and the study of a simpler way to process this process is likely to reduce the required time, and it is assumed that the performance of the devices that virtualize the images will also be  Figure 13 is the result of the proposed mechanism, and initially, the face identified as (B) is replaced with the virtual face by using the original image information that has not undergone a separate non-identification process as shown in (A). The information is recorded in a separate database, and when the original image is required, it is displayed as the original image as shown in (C) to show the face recognition process.   Figure 13 is the result of the proposed mechanism, and initially, the face identified as (B) is replaced with the virtual face by using the original image information that has not undergone a separate non-identification process as shown in (A). The information is recorded in a separate database, and when the original image is required, it is displayed as the original image as shown in (C) to show the face recognition process. The experiment was conducted using the following CPU: Intel i3-6100 3.70GHz; random access memory (RAM): 4 GB environment. To verify the effectiveness of the mechanism proposed in the research process, the procedures required in the face virtualization phase and the time required for image virtualization were analyzed.
First, the procedures of face virtualization required four stages: face area detection, vector operation in the detection area, face vector-based image selection, image composition, face area detection, vector operation in the detection area, virtualization vector-based image extraction and image reconstruction, which required a total of eight steps. While traditional image privacy techniques proceed in three stages-face extraction, image processing algorithm operation and the resultant image production-it can be found that relatively many processes are required. The more processing required by the required items compared to the general images require more processes by adding face virtualization steps for the de-identification of the images in the existing procedures of face extraction, image processing and encrypted image generation. Secondly, we analyzed the time required to virtualize images, and in this process, only one person who was able to detect the face area was using the mp4 extension for the experiment. For the equality of time measurements, the process measured the time for frame extraction in images, face area detection, face virtualization and image production using only extracted frames, which typically took three times the time of the images taken. The images used for the experiment were 8 s long and took about 20 s to virtualize the face within the image and produce it. Time beyond the actual image was difficult to graft on the site, but it was conducted after a bitmap reinterpretation of the extracted images in the experimental environment, and the study of a simpler way to process this process is likely to reduce the required time, and it is assumed that the performance of the devices that virtualize the images will also be The experiment was conducted using the following CPU: Intel i3-6100 3.70GHz; random access memory (RAM): 4 GB environment. To verify the effectiveness of the mechanism proposed in the research process, the procedures required in the face virtualization phase and the time required for image virtualization were analyzed.
First, the procedures of face virtualization required four stages: face area detection, vector operation in the detection area, face vector-based image selection, image composition, face area detection, vector operation in the detection area, virtualization vector-based image extraction and image reconstruction, which required a total of eight steps. While traditional image privacy techniques proceed in three stages-face extraction, image processing algorithm operation and the resultant image production-it can be found that relatively many processes are required. The more processing required by the required items compared to the general images require more processes by adding face virtualization steps for the de-identification of the images in the existing procedures of face extraction, image processing and encrypted image generation. Secondly, we analyzed the time required to virtualize images, and in this process, only one person who was able to detect the face area was using the mp4 extension for the experiment. For the equality of time measurements, the process measured the time for frame extraction in images, face area detection, face virtualization and image production using only extracted frames, which typically took three times the time of the images taken. The images used for the experiment were 8 s long and took about 20 s to virtualize the face within the image and produce it. Time beyond the actual image was difficult to graft on the site, but it was conducted after a bitmap reinterpretation of the extracted images in the experimental environment, and the study of a simpler way to process this process is likely to reduce the required time, and it is assumed that the performance of the devices that virtualize the images will also be significantly different. Table 2 shows a comparative analysis of the existing non-identification techniques and the proposed mechanism.

Conclusions
With increases in the number of video surveillance systems (VSSs), the number of per capita exposures to CCTV cameras is also increasing. Moreover, the ongoing advancement in technology is upgrading VSSs to the extent that they can recognize a particular individual or object or the actions of a moving object from the recordings and can understand the situations by sharing recorded information from and between the dispersed systems. Such capabilities are making VSS-enabled control easier. A concern over the violation of privacy of the objects being exposed to VSSs is nonetheless growing. In addition, the act of compromising the integrity of the system through image forgery of VSS causes a big problem. Images whose integrity is compromised cannot prove the justification of the images, and the surveillance of the place that is the purpose of the VSS may not produce sufficient results.
The main section of this study examined the differences between the existing data privacy preservation techniques and the mechanism herein proposed (the "proposed mechanism"). By recording the original image of the virtualized face on a blockchain, it is possible to prove the integrity of the image by lowering the possibility of face forgery. The proposed mechanism is capable of complementing problems shown in the existing data privacy protection methods. The unique feature of the mechanism includes its ability to detect face regions in the recorded visual information and locate the specific vectors in the detected face regions and substitute them with similarly constructed virtual faces. The identification information in the substituted visual data can be changed by adjusting similarity.
The market for VSSs is showing an upward trend each year, and the number of VSS control centers is also showing growth. These growths, however, are not being accompanied by the implementation of data privacy preservation techniques. Even in the cases where such techniques are applied, the original data cannot be inferred, depending on the methods used, or are susceptible to leakage via encryption key exposure. The violation of privacy due to video data leakage can be a serious issue in the coming years. As such, it is imperative that preventive methods be studied.
Author Contributions: J.K. implemented a methodology for interpreting the identification data of images and virtualizing the identification data. J.K. is currently working as a researcher at the Cybersecurity Human Resource Institute, Jeju National University. His current research interest includes Cloud System, Intelligent Video Surveillance System and Internet of Things.; N.P. contributed greatly to the conceptualization of methodology and overall project management and preparation of manuscripts. All authors have read and agreed to the published version of the manuscript. N.P. is currently working as a Professor in the Department of Computer Education, Teachers College, Jeju National University. His current research interest includes Convergence Technology Security, Computer Education, Smart Grid, IoT, Sea Cloud.