The development of deep learning enables the analysis of massive amounts of image data. A major challenge in these processes is how to analyse image data while protecting the images from leakage and exposure. Traditional access control policies may become invalid when images are stored again on different servers. For example, 4G and upcoming 5G techniques enable smartphone users to share their images easily. When an image is uploaded to a service provider (e.g., Facebook), a user can set access privileges so that the image can be accessed only by “friends” or by the public. However, when the accumulated images are redistributed to other parties for further analysis (e.g., the Facebook-Cambridge Analytica scandal [1]), the access control policies that were stored on the original servers are lost. Thus, a desirable protection solution should integrate access control policies with the image itself. In other words, even if the image is redistributed, the access control policies remain attached (in-situ). In addition, a simple “yes” or “no” access control on an image does not work well. For example, when a photo is taken with a smartphone, additional information such as location (latitude/longitude), map, date, and time is also included, and different privileges should be attached to that information. Consider geographic or remote sensing images as another example. A geographic image may consist of several areas or layers. Thus, differentiating access control for various areas or layers requires fine-grained and flexible access control policies.
Recently, centrally regulated access control models (e.g., [2]) have been intensively studied. However, they are not suitable for image data sharing and redistribution, for the following reasons. First, distributed data can be accessed in only two modes: “Yes” for all or “No” for all. Data that cannot be accessed publicly cannot be distributed, and once data is distributed, it can be accessed by all accessors. Second, for data that must remain under access control (classified data), control policies are difficult to define and change, especially when the data volume is large. For example, if different areas in a single image have different access policies, different regulations must be set up on the central control servers. Third, classified data can be accessed only when remote policy conformance servers are available. The accessibility of the data thus depends on the availability of networks and the workload of central control servers, which constrains the convenience of remote access. Fourth, access control regulation for a large volume of data results in a large delay: each time accessors request images, they must first fetch access control policies from servers. In big data scenarios, this places a heavy burden on servers and increases access delay. Finally, once data is distributed, the control domain changes, so the old management authority may no longer be able to control the data.
Therefore, with the development of big data sharing and redistribution, traditional access control models based on central conformance should be improved to cater to the new requirements.
In this paper, we design a novel access control model in which access control is conducted by specific clients and access policies are carried together with the access objects themselves. Our proposed access control model has the following advantages. First, access control policies are attached to the image data: regardless of how many times the data are redistributed, access control policies remain incorporated with the data. Second, access control is fine-grained. For large images (e.g., geographic or remote sensing images), control strategies must be specific to different partial areas instead of the entire image. In other words, different parts of one image must conform to different access privileges. Third, accessing classified data does not rely on remote servers or available network connections. The control flow becomes more lightweight because regulation is moved to clients (we also call this in-situ control).
Based on the above observations and analysis, we propose a new access control model for big image data sharing and redistribution. The major contributions of this paper are listed as follows:
We propose a watermark-based access control model, allowing objects being accessed to integrate together with access control strategies.
We propose a hierarchical key-role-area access control model for images with large size such as geographic graphs and remote sensing graphs. We also propose a hierarchical key generation method that can guarantee fine-grained access privileges.
The rest of the paper is organized as follows: Section 2 surveys related work. Section 3 formulates the research problems and challenges. Section 4 elaborates on the proposed models. Extensive analysis of the proposed scheme is presented in Section 5, and we conclude the paper in Section 6.
2. Related Work
The topic of watermarks has been explored for decades. With powerful software and personal computers, considerable unauthorized copying and distribution of digital content, such as e-books, videos, and digital images, has emerged. To solve this problem, watermarks are usually used to verify and protect copyrights [4]. In these methods, both fragile and robust watermarks are coded as a legal label rather than as a control technique. Additionally, many methods have been proposed to detect the modification of images [7], but they are unable to identify the modifier or prevent such modifications.
In recent years, several watermark schemes have been put forward for access control. Watermarks for permitting hierarchical access control and protecting the content of visual medical information were proposed in [9]. However, original images are not encrypted in this scheme. A removable and visible watermarking scheme that combines block truncation coding and a chaotic map was proposed in [10]; it can be applied to copyright notification and access control in mobile communication. The scheme uses two-stage watermarks that blur original images before visitors pass access control, so that only authorized visitors can obtain clear images. However, it does not provide hierarchical access control. A. Phadikar proposed a data hiding scheme for access control and error concealment in digital images [11]. He also proposed a data hiding method that integrates access control and authentication in a single platform, especially for cover images [12]. Encrypted digital images are displayed in lower quality before watermarks are read. To summarize, the schemes above display images in lower-quality formats before visitors obtain permissions, and the access control strategies are still not coded in watermarks.
Quality access control is used in audio watermarks. K. Datta et al. proposed a combination of encryption and audio watermarking for the safe distribution of audio content over public networks, whereby only authorized users can access the high-quality content, while other users can access only low-quality content [13]. Watermarks can also be used in video files to identify pirates; they can be extracted at the decoder and used to determine whether the video content is watermarked [14]. We stress that our proposed scheme for integrating access control policies as watermarks can also be applied to audio or video files, although we concentrate on images in this paper.
Geologic mapping and the design of geologic (thematic) maps are currently supported by Geographic Information Systems (GIS). To gain a high degree of efficiency and to allow the exchange of a common structured framework, agencies and individuals have designed map data models to support their mapping processes. File-based geo-databases are much more accessible, but still suffer from a number of administrative limitations [15]. A new access control mechanism that combines trust and role-based access control models is presented in [16]. J. Kim proposed a multi-layer based access control model for GIS mobile web services [17]. The objective of such spatially-aware access control models is to regulate access to protected objects based on position information. M. Kirkpatrick proposed role-based access control with spatial constraints [18]. F. Ma et al. proposed a fine-grained access control model for spatial data in a grid environment based on a role-based access control model [19]. Furthermore, a multi-granularity spatial access control model was proposed that introduces more types of policy rule conflicts than single-granularity objects [20]. The model can manage and enforce strong and efficient access control in large-scale environments. However, none of these access control strategies are encoded into watermarks, and access control still relies on servers.
In recent years, Quick Response (QR) codes have become popular due to their efficiency and security. They are widely used on mobile phones (e.g., in applications for instant messaging, user login, and mobile payment). QR codes can not only store a large amount of information, but also have error-correction ability [21]. In addition, QR codes have a high recognition rate, and many algorithm libraries are available to invoke [22]. For these reasons, we chose the QR code as a case study for our model.
3. Problem Formulation
3.1. System Model
Figure 1 depicts the traditional access model, which includes four entities: servers, accessors, images, and an access control unit. The access control unit is co-located with the servers. Traditional access control processes include four steps, as follows: (1) Accessors request to fetch some data (e.g., images) from servers; (2) Servers query access control strategies from the access control unit to determine the corresponding accessible objects (e.g., images); (3) The access control unit regulates access privileges as well as accessible objects accordingly; (4) Servers return accessible objects to accessors according to the designated privileges.
Once accessors enquire servers for data, servers first have to search access control strategies. According to the access control policies, servers then decide what data can be provided to accessors.
In big data publication scenarios, we move the access control unit to clients, so as to provide persistent control. We change the access control processes as follows: (1) Servers incorporate access control strategies into images as watermarks. (2) Accessors request to fetch some data (e.g., images), and servers publish image big data to accessors. (3) The access control unit in clients parses access control strategies in watermarks to determine access to objects in images. (4) The access control unit regulates access privileges and returns accessible objects to accessors.
Figure 2 depicts our proposed new access control architecture.
Note that the embedding method for access control policies is independent of the above architecture. Watermarks or other associated tags are workable as long as they can reveal access control policies. In most cases, invisible watermarks may be preferred.
Access control policies are embedded with big data, and thus the access control unit is moved to clients for persistent control, regardless of how many times the data are re-distributed. Additionally, access control can be accomplished without assuming the availability of servers and networking connections, which also mitigates the workload of servers and shortens the access delay.
3.2. Attack Models
3.2.1. Transferring Attack
Existing access control models invite the transferring attack. In a transferring attack, if accessor “A” can access image “P”, then accessor “A” can transfer image “P” to others, such as accessor “B”. Thus, accessor “B” can easily gain the access privileges of accessor “A”.
To tackle this attack, we propose the use of a watermark-based access control model where access policies are embedded with objects and move the access control unit from servers to clients.
Besides, transferring attacks are not accountable. That is, it is impossible to trace back to the original leaking accessors if many accessors can access the same objects. In other words, the provenance of leakage is lost. To provide provenance, we can also rely on watermarks that reveal the identification of originators or leakers.
For persistent access control, access control policies need to be associated with accessible objects, and the objects can only be accessed upon parsing policies at clients. Additionally, the objects need to return to an inaccessible status after the allotted time of authorized access.
Proof. If objects do not return to an inaccessible status after being accessed, others can also access those objects when the objects are transferred to them.
If access control policies are not associated with accessible objects, clients will not be able to enforce access policies. ☐
For the provenance of distributed data, data must carry the identification information of originators.
Proof. If data do not carry any of the originators’ identification information, the provenance of who distributes data cannot be determined. ☐
3.2.2. Distributed Denial of Service (DDoS) Attack
Traditional access control models rely on the availability of servers and access control units. This availability can be damaged by a distributed denial of service (DDoS) attack. If servers or access control units cannot be accessed, access processes or services will be terminated. It is much easier to keep clients available than servers; thus, access control that is migrated to clients will be more scalable and durable.
3.2.3. Coarse Access
In traditional access control models, servers are confronted with a large volume of data and access requests, and fine-grained access control will experience much difficulty due to workload. It is not fine-grained if access control is specific to an entire image, instead of for a specific area or layer in the image—especially for those images that have large size such as geographic graphs or remote sensing graphs. Traditional models may have to tackle fine-grained access by extra control, which further increases the overhead of servers.
3.2.4. Physical Copy Attack
In image big data distribution, the most difficult attack to defend against is the physical copy attack, in which images are copied by physical means such as screen capture or external photo shooting. After accessors gain access to images, those images are fully displayed and out of (access) control. This attack must be tackled, especially if certain areas or layers in images must remain confidential. It cannot be defended against by access control, because access control is a proactive defense applied before events. However, the attack can be traced back by watermark-based schemes for provenance, which is a reactive defense applied after events.
Physical copy attack cannot be defended against by any access control schemes, but it can be traced back to the source of image leakers, which is called provenance. The provenance can only be achieved by associated watermarks in images.
Proof. As images can be uncovered and viewed by authorized accessors, physical copy attacks such as screen capture and photo shooting are also possible.
The provenance can be achieved by embedding watermarks in images, as watermarks are also carried by images during and after physical copy attack.
Only when some watermarks associated with the identity of originators are embedded with uncovered images can the provenance of originators who exposed the images be accomplished from leaked images. ☐
3.3. Design Goals
We list design goals as follows: Design a novel access control flow that migrates the control unit from servers to clients. Design a watermark-based access control model that provides fine-grained access control for various areas or layers in a single image. Defend against attacks imposed by traditional access control models and propose a tailored design for big data sharing and redistribution of images with large sizes.
Images can be downloaded only from servers who embed access policies into images via watermarks.
Images can only be viewed via particular client tools, such as an image browser that can extract watermarks, parse watermark semantics into policies, and enforce access control policies before viewing. The context of watermarks can be recognized by corresponding clients.
Accessors may register their roles on servers at first, and their roles can be affirmed by client tools before viewing images.
The client tool can transparently decrypt images by asking for the correct keys. After accessors view their corresponding partial areas, those areas are encrypted again by client tools transparently.
If a hard copy of images is obtained by screen capture or photo shooting, watermarks in images can facilitate the trace back to the accessor who was the last authorized viewer.
4. Proposed Scheme
4.1. Basic Settings
We first describe a concrete process to explain our scheme, which consists of three steps as follows:
Accessors registration. Accessors register for data access on servers. They are assigned a role or multiple roles by servers.
Data publication. Servers who are data publishers or distributors embed access control policies via watermarks in data such as images. Data is published, in which certain areas or layers may be encrypted by secret keys related to control policies.
Client conformance. Accessors request images via particular client tools, such as image browsers. Client tools ask accessors to present their roles and secret keys. Client tools enforce the control policies parsed from the watermarks embedded in images, and decrypt the corresponding areas or layers with the corresponding secret keys.
Obviously, data publication and client conformance are critical in the design. Next, we propose a hierarchical encryption model as a concrete scheme.
4.2. Hierarchical Key-Role-Area Access Control Model
The encryption (and decryption) of various areas in a single image can be conducted by the following proposed hierarchical models.
A key is a pair $k = \langle l, c \rangle$, where $l$ is a key level and $c$ is a key column. Keys should be classified into different levels. In other words, a key has two metrics: one is the key level, denoted as $l$, and the other is the key column, denoted as $c$.
The key-level function maps $K$ to $\mathbb{N}$, where $K$ is a set of keys and $l$ is a natural number representing the key level. It is a function, and it does not need to be one-to-one; that is, multiple keys may map to one level. It is onto. We denote a key with level $l$ as $k_l$. If multiple keys map to the same level $l$, we distinguish them as $k_{l,c}$.
The key-column function maps $K$ to $\mathbb{N}$, where $K$ is a set of keys and $c$ is a natural number representing the key column. It is a function, and it does not need to be one-to-one; that is, multiple keys may map to one column index. It is onto. We denote a key with column index $c$ as $k_c$. If multiple keys map to the same column $c$, we distinguish them as $k_{l,c}$.
Key derivation is defined by $k_{l+1,c} = H(k_{l,c})$, where $k_{l,c}, k_{l+1,c} \in K$ and $H$ is a one-way function. It is computationally infeasible to obtain $x$ from $H(x)$.
A key $k_{j,c}$ can be computed from any $k_{i,c}$ with $i < j$ in the same column by applying $H$ repeatedly, $j - i$ times.
Simply speaking, a key with a larger key level can be derived from any key with a smaller key level in the same key column. If accessors possess a key of a smaller level, they can derive all keys with larger key levels in the same key column. Thus, the holder of a smaller-level key can decrypt data encrypted by a larger-level key, but not the reverse.
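The derivation rule can be illustrated with a short sketch. It assumes SHA-256 as the one-way function and uses hypothetical names (`derive_next`, `derive_key`, `root`); the scheme itself does not prescribe a particular hash. A smaller-level key yields every larger-level key in its column, while a request in the reverse direction is rejected.

```python
import hashlib


def derive_next(key: bytes) -> bytes:
    """One derivation step: level l -> level l + 1 in the same column.

    SHA-256 stands in for the one-way function (an assumption).
    """
    return hashlib.sha256(key).digest()


def derive_key(key: bytes, from_level: int, to_level: int) -> bytes:
    """Derive the key at to_level from the key at from_level (same column)."""
    if to_level < from_level:
        # Going back to a smaller level would require inverting the hash.
        raise ValueError("cannot derive a smaller-level key from a larger-level key")
    for _ in range(to_level - from_level):
        key = derive_next(key)
    return key


# Hypothetical level-1 key of one column.
root = b"column-0 level-1 secret"

# Level 3 reached directly and via level 2 gives the same key.
k3_direct = derive_key(root, 1, 3)
k3_via_2 = derive_key(derive_key(root, 1, 2), 2, 3)
```

Because each step is a hash of the previous key, servers only need to store or hand out one key per role; all larger-level keys follow from it.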
A role is a triple $r = \langle l, c, u \rangle$, where $l$ is a key level, $c$ is a key column, and $u$ is an identification to distinguish multiple roles for the same key. As multiple roles may map to the same key with $\langle l, c \rangle$, multiple identifications (e.g., $u$) are required to distinguish these roles.
, where is a set of roles; is a set of keys. It is a function. It does not need to be one-to-one. That is, multiple roles may map to one key. We denote that maps to the same key as is on-to.
Simply speaking, multiple roles may be related to one key. Regarding the privileges for images, the main one is “read”. A role with a smaller (higher) level can access all objects that can be accessed by roles with larger (lower) levels. Each role is mapped to a key.
where is a set of roles; l is a natural number representing a key level. Note that That is, roles are also hierarchically classified into different levels.
where is a set of roles; c is a natural number representing a column number. Note that This function returns a key index (in terms of key column) for a role, which can be used for guaranteeing derivative relationship between keys.
where is a set of roles; u is a natural number representing users who are associated to the same key. Note that if then
The model proposed above is illustrated in Figure 3
Differentiate Areas by Roles
An area is a tuple $a = \langle l, c, u, i \rangle$, where $l$ is a key level; $c$ is a column number; $u$ is an identification to distinguish multiple roles for the same key; and $i$ is an identification to distinguish multiple areas for the same role.
The mapping from areas to roles is a function. It does not need to be one-to-one; that is, multiple areas may be assigned to one role. As $r$ is a tuple with three elements, $a$ is a tuple with four elements.
is a function. It does not need to be one-to-one. Note that
. Note that
. Note that
. Note that
Note that, can also be replaced by . In geographic images, there may be multiple layers in a single image.
Areas could be of any shape (e.g., circles or rectangles); the shape is independent of the design in this paper. The details of areas can be embedded in watermarks, such as a one-point location with two rectangular edges. Areas for different roles can overlap. For different roles with the same key, the areas may be different, and the area information for one role may not be available to the other role.
If we remove the constraints of from a function to any mapping, then one role may map to multiple keys.
The proposed access control model is illustrated in Figure 4
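A minimal sketch of the role and area tuples follows, using hypothetical Python names (`Role`, `Area`, `accessible_areas`). It assumes that a smaller level number means higher privilege, and that roles sharing a key at the same level see only their own areas. Both are readings of the model rather than a definitive implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Role:
    level: int   # l: key level (smaller number = higher privilege, by assumption)
    column: int  # c: key column
    user: int    # u: distinguishes roles that share the same key


@dataclass(frozen=True)
class Area:
    level: int
    column: int
    user: int
    index: int   # i: distinguishes several areas assigned to one role

    def role(self) -> Role:
        """The role this area is assigned to (first three tuple elements)."""
        return Role(self.level, self.column, self.user)


def accessible_areas(role: Role, areas: List[Area]) -> List[Area]:
    """Areas of the role itself plus areas of larger-level roles in its column."""
    result = []
    for area in areas:
        if area.column != role.column:
            continue  # keys in other columns cannot be derived
        if area.level > role.level or area.role() == role:
            result.append(area)
    return result


areas = [
    Area(1, 0, 0, 0),  # higher privilege than the role below: hidden
    Area(2, 0, 0, 0),  # own area
    Area(2, 0, 0, 1),  # second own area
    Area(2, 0, 1, 0),  # same key, different role: hidden by policy
    Area(3, 0, 0, 0),  # lower privilege: visible
    Area(2, 1, 0, 0),  # other column: hidden
]
visible = accessible_areas(Role(2, 0, 0), areas)
```

Keeping the area-to-role link inside the area tuple means the policy can be evaluated from the watermark alone, without a server-side lookup.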
4.3. Image Publication
Images can be processed before publication as follows:
Servers select an image to publish. Corresponding areas (e.g., ) in this image are split according to security concerns and assigned to different roles. Areas are layered into different security levels, such that roles who can access higher security level (with larger key level) will be able to access lower security levels (with smaller key level). Servers formulate access control strategies by , where .
Servers code access control strategies into watermarks and embed them into published images. For example, QR codes can be used as watermarks, and strategies are coded into QR codes.
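As an illustration of coding strategies into a QR payload, the sketch below serializes a policy to compact JSON bytes. The field names (`role`, `areas`), the rectangle layout `[x, y, width, height]`, and the helper names are assumptions made for this example; the paper does not fix a payload format. Rendering the bytes as an actual QR image would be done by a QR library and is not shown.

```python
import json
from typing import Dict, List

# Byte-mode capacity of a version 40 QR code at error-correction level L.
MAX_QR_BYTES = 2953


def encode_policy(entries: List[Dict]) -> bytes:
    """Serialize policy entries to compact JSON bytes for a QR payload."""
    payload = json.dumps(entries, separators=(",", ":"), sort_keys=True).encode("utf-8")
    if len(payload) > MAX_QR_BYTES:
        raise ValueError("policy does not fit into a single QR code")
    return payload


def decode_policy(payload: bytes) -> List[Dict]:
    """Parse the payload extracted from the watermark back into entries."""
    return json.loads(payload.decode("utf-8"))


# Hypothetical policy: role tuples (l, c, u) and their rectangular areas.
policy = [
    {"role": [1, 0, 0], "areas": [[0, 0, 512, 512]]},
    {"role": [2, 0, 0], "areas": [[512, 0, 256, 256], [512, 256, 256, 256]]},
]
payload = encode_policy(policy)
```

The size check matters in practice: a policy with many areas may need a more compact encoding or several codes.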
Servers maintain a table for the image , and encrypt specific areas in images with corresponding keys. For example, servers encrypt a by . is a one-way function. instead of is stored for better confidentiality. is initialized by servers in .
in this image, a is encrypted by and note that all are identical.
in an image, we have . Simply speaking, for all areas in one image, encrypt keys must be in the same column index.
in an image, if , then due to .
4.4. Client Conformance
Client conformance for access control can be processed as follows:
Accessors request images via a particular client tool (e.g., image browser).
The browser prompts the accessor for a secret key and a role.
The browser extracts a QR code, obtains access control strategies (i.e., ). All are obtained for . That is,
The browser computes , and decrypts all areas for (i.e., a). Note that the key is not stored in the browser, and only is computed temporarily by the browser and destroyed after browsing.
Calculate all , , and decrypt left areas at lower levels. That is, by .
The browser displays all a to the accessor.
Accessors close the browser, and the browsed image returns to its original encryption status.
Servers will maintain consistency with client tools for function (i.e., the same ). Once the consistency is retained, can be evolved further regularly to provide forward security. Alternatively, an extra pairwise key (e.g., ) between servers and client tools can be introduced into as (e.g., ). We stress that client tools do not locally and permanently store accessor keys. Instead, decryption keys for encrypted areas in images are computed temporally upon browsing.
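The publication and client conformance steps can be sketched end to end as follows. This is a toy illustration under stated assumptions: SHA-256 plays the one-way function, areas are keyed by level only, and `xor_cipher` is a stand-in stream cipher built from the hash (a real deployment would use a vetted cipher from a cryptographic library). All function names are hypothetical.

```python
import hashlib
from typing import Dict


def h(key: bytes) -> bytes:
    """One-way step from level l to level l + 1 (SHA-256 is an assumption)."""
    return hashlib.sha256(key).digest()


def enc_key(key: bytes) -> bytes:
    """Encryption key computed from a role key; the role key itself is never used directly."""
    return hashlib.sha256(b"enc" + key).digest()


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher (stand-in only): XOR with a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))


def level_keys(start_key: bytes, start_level: int, max_level: int) -> Dict[int, bytes]:
    """Role keys for start_level..max_level, derived by repeated hashing."""
    keys, k = {}, start_key
    for level in range(start_level, max_level + 1):
        keys[level] = k
        k = h(k)
    return keys


def publish(areas: Dict[int, bytes], level1_key: bytes) -> Dict[int, bytes]:
    """Server side: encrypt each area with the key of its level."""
    keys = level_keys(level1_key, 1, max(areas))
    return {lvl: xor_cipher(enc_key(keys[lvl]), data) for lvl, data in areas.items()}


def browse(published: Dict[int, bytes], role_level: int, role_key: bytes) -> Dict[int, bytes]:
    """Client side: derive keys in memory and decrypt areas at role_level or larger."""
    visible = [lvl for lvl in published if lvl >= role_level]
    if not visible:
        return {}
    keys = level_keys(role_key, role_level, max(visible))
    # Derived keys live only in this call and are discarded on return.
    return {lvl: xor_cipher(enc_key(keys[lvl]), published[lvl]) for lvl in visible}


areas = {1: b"secret layer", 2: b"internal layer", 3: b"public layer"}
k1 = b"level-1 key"
published = publish(areas, k1)
view_level2 = browse(published, 2, h(k1))
```

Here the level-2 accessor recovers levels 2 and 3 but cannot reach level 1, since that would require inverting the hash. The published dictionary is never modified by `browse`, which mirrors the image returning to its encrypted status after viewing.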
4.5. Case Study
It is a trend to incorporate multiple maps of one location into one map as multiple layers. For a better explanation, we separate a combined map with multiple layers into three individual maps. In this case study, three maps of Shanghai are displayed in Figure 5: a remote sensing image, a geologic map, and a city planning map. These three maps describe three aspects of the same location. A combinative map can provide various aspects of one location in one map through multiple layers, which facilitates fast linkage to relevant information within one area.
The security levels of roles and corresponding layers are embedded into maps as watermarks, and thus access control strategies can be obtained from distributed maps without consulting servers. Accessors present their roles to a dedicated client tool such as an image browser, and specific areas that can be accessed by presented roles will be determined by the client tool.
In one map, accessible areas are encrypted by corresponding keys (e.g., is encrypted by ). Only someone who presents the correct key (e.g., ) can view the corresponding encrypted areas (). We also provide a kind of hierarchical access by hierarchical encryptions for areas. That is, keys at lower security levels can be derived by keys at higher security levels (e.g., can be derived by ). Thus, areas for roles in lower security levels can also be decrypted and viewed by roles with higher security levels. Upon request for images by an accessor, the image browser will prompt the accessor to present their key (e.g., ). The image browser will compute and use it to decrypt corresponding areas.
In combinative maps, one area consists of multiple aspects presented in layers. For example, geology, remote sensing, and city planning are three layers of a single city, Shanghai. Some accessors may only be able to access one layer among them. Accessors present their roles and keys to reveal corresponding layers.
5. Security and Performance Analysis
5.1. Security Analysis
Defending Against Transferring Attack. Images are encrypted by designated keys related to the corresponding roles or accessor identifications, and accessors must present the correct keys to enable client tools to decrypt images for browsing. Encrypted images cannot be decrypted without keys, even if the images are transferred to others. Moreover, encrypted images can only be decrypted and displayed in client tools. Images return to their original encrypted status after browsing.
The control unit migrates to client tools and it maintains control even though images are redistributed again. The control policies are associated with images as watermarks, which specify what areas can be viewed for given roles. The decryption can only occur upon browsing, and the encrypted area returns back to confidential status after images are browsed in the client tools. That is, the encrypted areas (layers) are transparently decrypted and ephemerally displayed upon browsing.
Defending Against DDoS Attacks. As access control logic is embedded in watermarks together with images, client tools can enforce access policies without consulting servers or relying on network connections. Thus, DDoS attacks on servers and network connections do not interrupt access control.
Defending Against Coarse Access. Our model can differentiate the access privileges for various areas in a single image, and similarly, further access control for various layers in a single area are also possible iteratively.
Defending Physical Copy Attack. As visible watermarks such as QR codes or invisible watermarks are incorporated with images, anyone who obtains physical copies of images by screen capture or outside camera shooting will be traced back by watermarks. The roles and identifications can be revealed by decrypted areas in captured images and control policies in watermarks.
It is hard to compute from if where is a one-way function.
Proof. Straightforward. We use a one-way function to derive keys at lower security levels from keys at higher security levels. As the function is one-way, the derivation of keys is also one-way. That is, it is hard to compute x from its image under the one-way function. ☐
5.2. Performance Analysis
Computation Cost. The major computations in the scheme are as follows: encoding and decoding watermarks, encrypting and decrypting areas in images, and computing the one-way function. Watermark encoding needs to be conducted only once. Encryption is conducted once for each image, and decryption is conducted once for each instance of image browsing. Note that encryption and decryption cannot be avoided for image access control, as some contents must be encrypted for confidentiality. One-way function computation is lightweight (e.g., a cryptographically secure hash function).
Higher Access Throughput and Less Access Delay. The access control policies are embedded into watermarks and distributed with images; thus, it is not necessary to consult servers to determine which areas can be accessed. This improves the scalability of data access. Besides, the access delay is decreased because no communication latency between servers and clients is incurred for consulting policies.
Efficiency. A balance between servers and clients is preferred, instead of only relying on servers. Servers only need to attach a watermark to an image and encrypt designated areas upon data publication, which can be accomplished in a batch. Client tools only need to decode a watermark and decrypt corresponding areas. The decryption is conducted at the client side, which is much more lightweight than at the server side. The encryption and decryption are mandatory because some areas are confidential.
Convenience. The deployment is convenient. Particular client tools can be deployed as middle-ware over normal image browsers. Besides, communication channels and networks are not required, which brings more convenience for accessors.
QR codes can be used for fast generation and decoding of watermarks. They offer large capacity, fault tolerance, easy generation, and fast decoding. Thus, the overhead of attaching and decoding watermarks is manageable.
Table 1 compares the advantages and disadvantages of our scheme and existing schemes.
6. Conclusions
In this paper, we propose a watermark-based access control model. In contrast to current access control methods, we attach access control strategies to the accessed objects (e.g., images) as watermarks, instead of storing access control strategies on servers. Our proposed model makes it possible for accessors to view images without accessing servers, which eases the burden on servers and shortens the access delay. In addition, our model defends against several attacks specific to accessing big image data, such as the transferring attack, DDoS attack, coarse access, and physical copy attack. Moreover, we propose a hierarchical key-role-area access control model. In this model, multiple areas in an image can be mapped to one role, and each role is associated with a hierarchical key. Hierarchical keys are classified into levels, and keys at higher security levels can derive keys at lower security levels. Thus, the various areas that can be accessed by different roles in one image can be encrypted by hierarchical keys. With this key-role-area model, fine-grained access control can be achieved in a more complex and customized manner. In particular, the method can also be applied to different layers in a single image (e.g., geographic maps). Furthermore, traceability of image leakage (e.g., of areas or layers) becomes possible due to the embedded watermarks.