Blockchain Technologies for Private Data Management in AmI Environments

Blockchain enables the creation of distributed ledgers as a type of database that is shared, replicated, and synchronized among the members of a network. In this paper we analyze how distributed ledgers can be used for empowering end-users to self-manage their own data, enabling third parties to access those data under a cryptographic management model. We propose a use case where both blockchain and smart contracts are employed by using cryptographic technology to enable user empowerment of data management in AmI. Finally, we analyze strengths and weaknesses of the proposed scenario.


Introduction
In the Big Data era, the amount of data is continuously increasing [1], from user's behavior [2] to other personal data as a new key economical resource [3]. While there are a lot of clear benefits concerning the use of Big Data technologies and the data themselves as a key resource for economy, there is a growing concern about user's privacy and the misuse of user's information [4]. Organizations collect large amounts of personal and sensitive data [5]. Specifically, Ambient Intelligence (AmI) environments require such information for providing context-aware [6] and other personalized services to optimize corporate decisions.
Individuals have little control over the data about them that is stored and how it is used [5]. In recent years, several controversial incidents related to privacy have been discussed on public media. Some of the known examples are government surveillance [7], and large-scale usage of Facebook's data used for influencing public opinion [4], and publishing location data without suitable controls [8].
In this context blockchain technology emerged as one promising technology for enabling a suitable data management using the concept of public ledger updated independently by each participant in the network. In this paper we explore how to combine the building characteristics of blockchain technology to define, create and deploy a user-centric private data management system in AmI environments, providing a data management preliminary solution in Ethereum and an abstract model for authorization of IoT devices and third parties to data. In particular, we contribute to the following objectives: • Definition of mechanism for user empowerment on data management scenarios, by deploying some mechanisms to store data provided by IoT devices in AmI environments, and to enable the end-user to generate and to revoke data usage authorization for others.
• Definition of secure procedures in blockchain for storing and reading data for IoT devices. Data stored on blockchain network will be provided and accessed by IoT devices with the corresponding credentials, to avoid unauthorized usage of such data storage.
The rest of the paper is organized as follows: Section 2 introduces blockchain technology and requirements for private data management in AmI environment; Section 3 provides a security management scenario for AmI environments using blockchain and smart contracts; Section 4 provides an implementation for the data management functionality as a smart contract; Section 5 proposes an abstract security management model and finally Section 6 analyses the proposed solutions and describes future developments.

Blockchain Technology and Data Management in AmI
In this section we analyze two main technologies for data management using blockchain technologies. First, we review the facilities offered by blockchain technologies and finally we go through technologies and requirements related to data management in AmI environments.

Blockchain Technologies
Blockchain emerged as a new technology with the success of cryptocurrencies [9]. A blockchain is a tamper-evident, shared digital ledger that records information (transactions and data) in a public or private peer-to-peer system. Distributed to all member nodes in the system, the ledger records, in an immutable sequential chain of cryptographic hash-linked blocks, the history of asset exchanged that take place between the peers (users, programs or contracts) in the network, preserving the validated list of transactions and the resulting data values associated to these transactions.
All the confirmed and validated transaction blocks are linked and chained from the beginning of the chain to the most current block, using the proof of work as the consensus mechanism for preserving the chain integrity. Blockchain offers several key characteristics that can be used for data management in AmI environment [10]: Data storage: blockchain packs data and transactions in a single structure. A block maintains the record of all transactions occurred across the system. It prevails as an immutable information in the blockchain.
Most relevant benefits of blockchain are fraud protection (prevents the unauthorized changes or malicious tampering, as changes are not possible), easy management (all accepted transactions can be found in the network in a sequential way), ownership (security is maintained by the means of keys and signatures), and lack of mediators (decreasing the settlement times of many transactions and speeding up the process).
Data Distribution: the owner has direct control of the data by using his/her private key. The key enables owner authentication and the owner can provide access rights to the data to whomever he wants.
Most relevant benefits of blockchain regarding data distribution are fault tolerance and attack resistance (decentralized networks are less likely to fail because of their lack of single point of failure), lower transaction costs (by the elimination of intermediaries and overhead costs for exchanging assets) and transparency and auditability (as all transactions are immutable and public).
Data integrity and authenticity: Anyone trying to alter a transaction stored inside a block would need an almost impossible amount of resources (memory, computing power and bandwidth) to alter sufficient number of nodes. Thus, the system cannot be easily corrupted. Guaranteeing the authenticity of data is very important; organizations and individuals have sensitive documents, assets or contracts that must be protected [11]. When data, in any from, is stored, a hash is created for this information piece. This mechanism that works for a single piece of information (data, file, document, etc.) can be extended to a large amount of data using technologies such as the Merkle tree or hash tree.
Smart contracts: Smart contracts [12] are cryptographic "boxes" containing programming logic that is executed if certain conditions are met. They offer more power than Bitcoin scripting because of the added powers of Turing-completeness, value-awareness, blockchain-awareness and state [13]. Smart contracts can function as 'multi-signature' accounts, so that funds are spent only when a required percentage of people agree, and also can store data and provide utility to other contracts (similar to how a software library works).

Data Management in AmI Environments
Data management is a key element in any AmI architecture [14]. A central element regarding data management in these scenarios is privacy, as the right of every person to control access to his/her own personal information [15].
Jøsang et al. [16] explain that the fundamental privacy protection principle is that exposure of personal information should be minimized. Translating this concept to personal data protection, this means that the fewer parties involved in the management of the identity information the better.
Possible weakness in security of a wireless system should be recognized so that the right measures can be taken to improve the user's confidence. AmI systems require data privacy, security, and physical security. A very small carelessness in the AmI security could really have a big impact to everyone involved [16]. Problems like authorization, authentication, and accounting are important while considering the data security. Different devices and standards for communications should be studied properly. Security requirements in AmI systems are studied from the data integrity, authentication and confidentiality point of view.
Data integrity requirement should ensure that the transmitted data from source to destination is unaltered by any means. The data could be intercepted in transit and can be modified [17]. Therefore, data integrity checks (by fingerprinting) should be performed so that the receiver could confirm that the data is not altered.
Data authentication is the process ensuring that the provider is truly the provider of the data [17,18]. The provider should be authenticated so that the attacker pretending to be the sender would not be able to fake the communication.
Data confidentiality is the process of hiding the information so that only the recipients could know of what's being provided by the provider. It can be achieved by using the data encryption algorithms defined in symmetric [16,18] (shared common key) and asymmetric (public-private key pairs) models.

Security Management Scenario for User Data in AmI Environments
In this section we present a security management scenario for user data in AmI Environments. The scenario proposed in Figure 1 integrates several actors: end-users, IoT devices controlled by the user, the third parties willing to access the private information provided by IoT devices, all around the blockchain infrastructure that is going to provide secure access to the data controlled by endusers. In this scenario users manage their own security keys (1), so they can generate suitable credentials, using cryptographic techniques, to be provided to the IoT devices (1.a), and they also are going to generate credentials (1.b) for third parties in order to enable them to access data of the user under strict conditions. Both credentials should be translated to the blockchain infrastructure so they can enforce the application of the rules defined by the end-user regarding data storage and data access. Cryptographic techniques suitable for supporting this are under development. A security model for such proposal is presented in Section 5.
IoT devices, by using security keys, are able to store information into the blockchain (2). Endusers feed the blockchain with suitable information, then the smart contract can apply security restrictions to store and retrieve data from the blockchain, while it provides a full data log. Third parties requesting end-user authorization for using private data can access those data stored in the blockchain (3). In this case blockchain is going to provide access log based on blockchain characteristics.

Implementing Data Management on Blockchain
In this section we present an implementation for the data management functionality in blockchain. We use Ethereum, an open software platform based on blockchain technology that enables developers to build and deploy decentralized applications. Ethereum allows us the implementation of smart contracts, written in Solidity, a contract-oriented programming language.
We provide an implementation of a data management procedure to handle user's private data in Ethereum, by using the smart contract capabilities.
The smart contract of Figure 2 identifies several key elements: The first line (1) of the contract defines the version of Solidity used for coding this smart contract. This defines the name of the smart contract as the name of the class in most object-oriented programming languages.
As one of the attributes of the class we can define data structures (5). In this case we define an array of uint (a short of integer) values for defining the data storage structure.
Another attribute (7) is defined for storing the address of authorized persons for using this contract. In this case for the sake of simplicity only the address of the owner is included in this list of authorized people, although a function can be added to include and remove authorized persons.
The function to generate events to be sent out of the contract is defined (9). Events are the natural way for sending information to users and other external applications.
One specific method (11) is used for checking authorization. In this simple case we checked that the sender of a request is included in the list of authorized users defined at Point 4.
We define the constructor of the smart contract (19). Function in line 23 enables a data reading. The data of the position pos is provided to anybody that requests the data. In this case no filter is applied.
Function in line 28 enables the storage of a data. A new data is stored in the next position of the array data. In this case only users meeting the requirements imposed by authorized function can use this function (in this example only the owner of the contract).
As a summary, this smart contract defines a data storage structure, with a list of authorized addresses (i.e., external users) that can use the functions getData and setData of this contract. An authorized function is defined for evaluating the access right of each invocation to these functions. This simple contract provides some solutions to common problems in AmI environments regarding data management, such as integrity (data storage structure in unaltered), authentication (the sender of a request must be included in the list of authorized users) and confidentiality (smart contract providers and consumers are only recognized by their public key).

An Abstract Model for Authorization of IoT Devices and Third Parties to Access a Smart Contract
In this section we present a general cryptographic model that can be used for generating credentials to enable IoT devices and third parties to store and to read data from the smart contract managed by the end-user. This model can be implemented for generating the credential, providing suitable information to the smart contract to enable them through the modifiers function to filter the access to the data repository.
In the proposed scenario, where keys must be distributed among an undefined (and probably unlimited) number of entities (including devices, third party users and stakeholders), encryption schemes based on symmetric keys have been proved to be highly inefficient and costly. Public Key Infrastructures (PKI), on the other hand, can be the answer to these problems, but the identity assurance of the key owner then becomes a new issue to solve.
Nonetheless, the proposed technique describes a high-level application which should be supported by existing communication technologies, so among all components in the scenario it is supposed a robust, reliable and highly available communication infrastructure. Therefore, problems associated to the private information protection during transmissions and to mechanisms to guarantee the identity of the different entities exchanging keys (which are addressed at link, network and transport level) are assumed to be resolved by the underlying communication infrastructure.
By default, blockchain networks are transparent and all data stored in the smart contracts are accessible by every user in the system. The use of these networks, however, enables all the community sharing information to maintain an agreed and unmodifiable record about the accesses and transactions made over these data. Nevertheless, organizations and owners (usually) need to stablish controls about who can obtain their information or data (systems must be permissioned) [19]. Therefore, incorporating a certain level of privacy in blockchain is required.
In this first work and use case, AmI data are stored in the smart contracts, but in future and advances works this information could be maintained in a database whose access in controlled by a smart contract. In that way, a record about the transactions involving the AmI data will be created, but information could be modified more dynamically and efficiently than in solution using only blockchain technologies. In conclusion, smart contracts only should manage the authorization of users to obtain data, which is the solution to be developed in this section.
In our application scenario, smart contracts, the blockchain network, third-party entities and sensitization devices typically communicate using TCP/IP technologies and REST-like petitions (a REpresentational State Transfer architecture). Thus, for all elements it is possible to stablish a secure communication channel with any other component in the system through protocols such as Hypertext Transport Protocol Secure (HTTPS), which includes solution for both privacy information protection and identity certification.
Therefore, in this section we are only discussing about the "intrusion tolerance" and "active protection" policies at application level. In other words, we are only describing a solution to guarantee that only authorized entities are allowed to communicate (though the described secure channels) with the proposed deployment.
Various special characteristics of blockchain networks and smart contract must be considered: • First, all attributes, parameters and variables in a smart contract are public. In that way, any user, program or device may access to any information about authorized entities that is stored in the smart contract.

•
Besides, sensitization devices are usually resource constraint and cannot perform complex algorithms. On the other hand, devices are also typically unmanaged and geographically sparse, so it is not safe to store information enough to break the security of the system in the ROM memory of these devices (as it could be physically accessed by unauthorized people) [20]. • Finally, in PKI, keys cannot be sent through public networks (even if secure channels are stablished), except if a piece of information is not transmitted to maintain the security of the entire solution.
In general, a final user who deploys a smart contract to manage information in AmI environments, employs an original and key . Then, it is proposed a digital signature scheme (DSS) described by the 3-tuple , , . The first element is a key generator, the second one is the signature function, and the third one is the verification procedure. From this DSS we only require to be homomorphic.
Homomorphic encryption is a special encryption type where applying any algebraic operation on the original information is equivalent to apply also an algebraic operation (not necessary the same) on the encrypted information (1). Many existing DSS schemes and encryptions algorithms are homomorphic (such ElGamal technique [21]).
being ⊛,⊚ algebraic operations. The proposed security model is showed on Figure 3. As can be seen, using the signature generator, the private user key and a n-tuple of additional parameters to increase the generator entropy, it is constructed a user operation key * . This user operation key is a 2-tuple, including (as any PKI) two different keys: a private user operation key * and a public user operation key * (2).
In order to authorize an AmI device to interact with a deployed smart contract, a device key is constructed. This device key is generated using a second generator ℱ that presents vector structure (3). It receives as parameters the private user operation key * and a n-tuple of additional parameters to increase the generator entropy. As result, the generator produces the device key and a piece of information , which is unique for each device, and is called "homomorphic constant".
As in the previous case, the device key is composed of a public device key and a private device key . In this proposal, it is required the private device key, the private user operation key and the homomorphic constant to be related through an equivalence in the digital signature algorithm of message (4). Although this relation may be difficult to fulfil, it must be noted that employed private keys during the digital signature process are different; thus, both signature functions are slightly different and the homomorphic can be obtained and found even if collisionresistant hash functions are used during the signature procedure. The resistance to the birthday attack, any case, is preserved. Any case, the calculation of this constant may be costly, so it is quite impossible to obtain several of them in a row to attack an entire AmI system. * being ⊛ an algebraic operation. At this point, and considering that the selected encryption scheme is homomorphic, it is possible to rewrite the signature operation of message in terms of a new parameter that is unique for each AmI device, ; and that corresponds to the signed homomorphic constant using the private device key (5) being ⊛,⊚ algebraic operations (5) Then, in the smart contract (blockchain network) only two parameters are stored: the public user operation key * , and the parameter for each authorized device. On the other hand, to the authorized device, it is only sent the private device key . In order to the smart contract to accept a data from an AmI device, it must send the corresponding information together with the signature ( ). Then, in the blockchain network, and using the properties of homomorphic encryption, the signature is verified using only data stored in the smart contract (6). Only if the AmI device employs a valid key during the digital signature procedure, the signature could be validated using the public user operation key and parameter . In that way, if signature is verified, the received information is accepted as the remote device is authorized to store data. * * This approach presents several important advantages. First, the smart contract does not store any private key or device key, which would be publicly accessible. Besides, as the authorization procedure is based on the signature of the collected data, pattern recognition techniques or statistical attacks are not successful as hash functions are not invertible. On the other hand, devices do not store a complete key, only the private key of a KPI, so some types of cyberattacks (such as Denial of Service) are also impossible, as nobody can encrypt a message using the private device key which is protected by the user. As keys are generated by the user, strong long-length keys could be employed, as resource constraint devices are not a problem to employ complex key generators.
Algorithm 1 shows the entire authorization mechanism for AmI devices. Finally, we must discuss how to authorize third-party entities to access to the data stored in the smart contract. The same strategy previously presented for AmI devices could be also employed, if desired. However, in this case, as these entities will be remote, keys are not recommended to be directly computed by the user. Instead of that a token-based solution is proposed. The user will produce for each third-party a token using a specific token generator which receives as parameters the public user operation key and a n-tuple of additional parameters to increase the generator entropy (7). The token is sent to the authorized third-party. * , , … , = On the other hand, using the key generator, the third-party creates a new key named as "third-party key" (8). The user receives, then, from the third-party the public third-party key , and stores it (with the corresponding token) in the smart contract.
Then, to access to certain data in the smart contract, the third-party should send (with the data request) the authorization token signed with the private third-party key . If this signature is verified, then, the third-party is authorized to access to data (9).
Algorithm 2 shows the entire authorization mechanism for third-party. As can be seen, in this algorithm only required information to manage privacy and user authorizations is stored in smart contracts. The storage and recuperation of AmI data should be developed in a different algorithm. If data are stored in blockchain networks, they should be encrypted before being written, in order to preserve their privacy.

Algorithm 2
Authorization mechanism for third-party User executes: ← () user waits for the public third-party key user receives the public third-party key user stores , in the smart contract Third-party executes: , ← ()

Conclusions and Future Work
In this paper we analyzed the problems related to data management by end-users in AmI environments and developed a user-centric private data management infrastructure.
Based on the analysis of blockchain, smart contracts and AmI environments, we designed one scenario where end-users can use their own secure keys for deriving cryptographic information that can be translated to both IoT devices and third parties. Using this cryptographic information, which is also managed by the smart contract, the IoT devices and third-parties are able to use the data repository according to explicit permissions granted by the end-users. Those permissions can be changed dynamically, when required, by the end-user.
The proposed contract is only an example of how to provide some solutions to common problems in AmI environments regarding data management, such as integrity (data storage structure in unaltered), authentication (the sender of a request must be included in the list of authorized users) and confidentiality (smart contract providers and consumers are only recognized by their public key).
Although the proposed smart contract fulfills many conditions of our scenario, it offers some relevant drawbacks, based on blockchain's basic design principles: every data stored in the smart contract is publicly readable when it appears in a public transaction, the cost of storing data is very high, and the list of authorized parties is also publicly available. For solving these limitations, we provide a security approach for user authorization so that smart contracts do not store any private key or device key avoiding security problems related to the public availability of the information.
We can conclude from the initial work presented in this paper, that blockchain technology combined with smart contract enable the end-users to own and manage the data generated by IoT devices in AmI environment. However, some problems arise in the provision of a public blockchain infrastructure such as the one provided by Ethereum. The main problem is that public networks do not provide means to effectively restrict read access to data, as all persistent data in a contract is readable to anyone even without the provision of access functions in Solidity. Thus, it is still required to deploy complex cryptographic solutions for avoiding problems of transparency related to blockchain and to link the blockchain with external devices in order to automatically store information in the system. Therefore, external data repositories are required to create hybrid solutions where benefits of blockchain networks (traceability of data transactions) are combined with the efficient and private access and storage of data. Other contributions related to these limitations, such as external storage capabilities and integrity (Merkle tree [22]) and data anonymity (Kademilia [23]) techniques will be studied in future works.
Author Contributions: The contributions described in this work are distributed among the authors in the way that follows: T.R. proposed and developed the paper's idea, B.B. proposed the security model, R.A. contributed to the theoretical formalization and paper redaction, and D.S. implemented smart contract for data management function.
Funding: Borja Bordel has received funding from the Ministry of Education through the FPU program (grant number FPU15/03977). Additionally, the research leading to these results has received funding from the Ministry of Economy and Competitive-ness through SEMOLA project (TEC2015-68284-R) and from the Autonomous Region of Madrid through MOSI-AGIL-CM project (grant P2013/ICE-3019, co-funded by EU Structural Funds FSE and FEDER).

Conflicts of Interest:
The authors declare no conflict of interest.