1. Introduction
Covert communication channels are designed to protect the relationship between the transmitter and the receiver by hiding the fact that secret communication is taking place at all. Such channels can be used, for example, in military communication or under an authoritarian government. Secure covert communication can be implemented as a combination of cryptography and steganography. Cryptography ensures that the communicated message stays private, while steganography hides the fact that encrypted communication is taking place. However, steganography requires a medium: a channel on which innocuous communication is taking place and to which both the transmitter and the receiver have access. In addition, the channel needs to be reliable so that the receiver gets the transmitted message with high probability and can be confident that it has not been tampered with.
Blockchain technology was first introduced as the underlying mechanism for the cryptocurrency Bitcoin [1] as an open decentralized method of providing trust. A blockchain is a public distributed ledger implemented as a continuously growing chain of blocks that is designed to provide authenticity of the data without any centralized parties. Its integrity is ensured collaboratively by the majority of the nodes participating in the blockchain network. Therefore, data recorded into a blockchain is inherently resilient to manipulation, since manipulation would require the adversary to control a large fraction of the nodes. Since its introduction, blockchain technology has attracted a lot of interest in various applications, such as smart contracts enabled by Ethereum [2], Internet of Things [3], healthcare [4] and medical data management [5]. Typically, blockchain networks, such as those underlying cryptocurrencies, are free for anyone to join. This openness, together with strong integrity guarantees on the stored data, makes blockchains an interesting platform for implementing covert communication channels.
There are three main advantages of blockchain compared to other cover mediums: (1) it is anonymous and free to join, meaning that the communication parties have free access, (2) submitted data cannot be altered and, in particular, the integrity guarantees are not provided by any centralized party, but the consensus of the entire network, and (3) published data cannot be removed, meaning that no authority can apply censorship to already published data. Since the blockchain is immutable, alteration of the covert messages is virtually impossible and the embedding of covert information is free to be fragile. This is not the case, for example, for images on an image board.
In this paper, we suggest a method of submitting covert messages through a blockchain considered as a payment platform. Due to the immutable nature of the blockchain, the sender’s ability to embed into it is limited. However, since everybody is free to submit payments into the chain, we use those payments to convey encrypted messages to a receiver. We start by providing a model for the blockchain, called a simplified ideal blockchain, where irrelevant technical details have been abstracted away. We then devise a method of reliably embedding into and extracting from a blockchain following this model by submitting payments, and show that the method is reliable and runs in expected polynomial time, excluding the time spent waiting for new blocks to appear in the chain. Based on the provable security of stegosystems, we then formulate a definition of secure covert communication on a blockchain based on the hardness of distinguishing the payload-carrying payments from random payments. Finally, we prove that our method satisfies this definition. The embedding rate of our method is low and the method is analyzed in a simplified model. Therefore, our proposal should be seen as a proof-of-concept scheme rather than a practical scheme that is ready to be implemented. However, our proposal is the first to achieve a covert channel over a public blockchain with provable security.
The paper is organized as follows. Section 2 explains the preliminaries for the rest of the paper, such as the utilized cryptographic primitives and steganography. In Section 3, we describe our model of a simplified ideal blockchain. The suggested method is explained in detail in Section 4 and its security is studied and proved in Section 5. Finally, Section 6 and Section 7 provide the discussion on future work and the conclusion, respectively.
2. Preliminaries
2.1. Notation
Let $\mathcal{D}$ be a probability distribution. A random variable $V$ that is distributed according to $\mathcal{D}$ is denoted by $V \sim \mathcal{D}$. If an element $u$ is sampled from a probability distribution $\mathcal{D}$, we denote $u \leftarrow \mathcal{D}$. The uniform probability distribution on a set $X$ is denoted by $\mathcal{U}(X)$. For probabilistic algorithms, we apply the standard notation [16]. When an algorithm $\mathcal{A}$ is run with an input $x$ and it outputs $y$, we denote $y \leftarrow \mathcal{A}(x)$. Algorithms can be given access to oracles. Oracles are considered as “black boxes” that can compute, for example, functions or other algorithms. An algorithm $\mathcal{A}$ with oracle access to an oracle $\mathcal{O}$ is denoted by $\mathcal{A}^{\mathcal{O}}$. We assume that calling an oracle and receiving its answer takes constant time $O(1)$.
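The length of a string $x$ is denoted by $|x|$. The least significant bit (LSB) of $x$ is $\mathrm{LSB}(x)$. The binary string of length $s$ consisting solely of ones is denoted by $1^s$ and is often used as a security parameter for cryptographic primitives. A function $\epsilon : \mathbb{N} \to \mathbb{R}$ is negligible if for every $c > 0$ there is $n_0 \in \mathbb{N}$ such that $|\epsilon(n)| < n^{-c}$ for every $n > n_0$.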
2.2. Cryptographic Primitives
Our method is based on the perceived randomness of the outputs of a hash function. Let $\{0,1\}^*$ denote the set of all finite-length strings over $\{0,1\}$ together with the empty string (Kleene closure). We follow the random oracle model, where a cryptographic hash function $H : \{0,1\}^* \to \{0,1\}^{\ell}$ with a fixed output length $\ell$ is modeled as a random oracle. For each input string $x \in \{0,1\}^*$, the output $H(x)$ is independently and uniformly randomly selected (and fixed for subsequent queries with the same input) from $\{0,1\}^{\ell}$.
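The lazy-sampling view of a random oracle described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `RandomOracle`, the method `query` and the 32-byte output length are hypothetical choices and not part of the model.

```python
import secrets


class RandomOracle:
    """Lazily sampled random oracle (illustrative sketch, not a real hash)."""

    def __init__(self, out_bytes=32):
        # Assumed output length in bytes; the model only requires it to be fixed.
        self.out_bytes = out_bytes
        self._table = {}

    def query(self, x: bytes) -> bytes:
        # First query on x: draw a fresh uniformly random output and store it.
        if x not in self._table:
            self._table[x] = secrets.token_bytes(self.out_bytes)
        # Later queries on the same x: return the stored value unchanged.
        return self._table[x]


oracle = RandomOracle()
first = oracle.query(b"some public key")
second = oracle.query(b"some public key")
```

Repeated queries on the same input return the same digest, while the digest of any new input is an independent uniform sample.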
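A symmetric encryption scheme $\mathcal{E} = (\mathrm{Gen}, \mathrm{Enc}, \mathrm{Dec})$ consists of three algorithms, where $\mathrm{Gen}$ is a key generation algorithm that on input a security parameter $1^s$ outputs a secret key $k$, $\mathrm{Enc}$ is an encryption algorithm that on input a key $k$ and a plaintext message $m$ outputs a ciphertext $c = \mathrm{Enc}_k(m)$, and $\mathrm{Dec}$ is a decryption algorithm that on input a key $k$ and a ciphertext message $c$ outputs a plaintext message $m$ such that $\mathrm{Dec}_k(\mathrm{Enc}_k(m)) = m$.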
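Algorithm 1 Pseudorandom Ciphertext Experiment |
1: | procedure $\mathrm{PRC}_{\mathcal{A},\mathcal{E}}(1^s)$ | |
2: | $k \leftarrow \mathrm{Gen}(1^s)$ | |
3: | $(m, S) \leftarrow \mathcal{A}_1^{\mathrm{Enc}_k}(1^s)$ | ▹ S is internal state information of $\mathcal{A}$ |
4: | $b \leftarrow \mathcal{U}(\{0,1\})$ | |
5: | if $b = 1$ then | |
6: | $c \leftarrow \mathrm{Enc}_k(m)$ | ▹ Use the actual encryption |
7: | else | |
8: | $c \leftarrow \mathcal{U}(\{0,1\}^{\lvert \mathrm{Enc}_k(m) \rvert})$ | ▹ Use a random string |
9: | end if | |
10: | $b' \leftarrow \mathcal{A}_2^{\mathrm{Enc}_k}(c, S)$ | |
11: | if $b' = b$ then | |
12: | return 1 | ▹ $\mathcal{A}$ guessed correctly |
13: | else | |
14: | return 0 | ▹ $\mathcal{A}$ did not guess correctly |
15: | end if | |
16: | end procedure | |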
A symmetric encryption scheme has pseudorandom ciphertexts under a chosen plaintext attack if a probabilistic polynomial time adversary is unable to distinguish the ciphertext message from a uniformly random string even when the adversary is able to choose the plaintext message. Formally, a polynomial time adversary $\mathcal{A}$ is run in two stages $\mathcal{A}_1$ and $\mathcal{A}_2$ and is given oracle access to $\mathrm{Enc}_k$, the encryption algorithm under a random secret key $k$. First, $\mathcal{A}_1$ is run with oracle access to $\mathrm{Enc}_k$ and it outputs a plaintext message $m$ together with its internal state information $S$ that it may need in the second stage. Then, based on a coin toss, the second stage $\mathcal{A}_2$ is run with either the correct encryption of the chosen plaintext message or a uniformly random string from the ciphertext space (together with the state information). This pseudorandom ciphertext experiment is defined in Algorithm 1.
The encryption scheme $\mathcal{E}$ has pseudorandom ciphertexts under a chosen plaintext attack if the success advantage, $$\left|\Pr[\text{the experiment of Algorithm 1 returns } 1] - \tfrac{1}{2}\right|,$$ is negligible for every probabilistic polynomial time adversary $\mathcal{A}$. For an example of a cryptosystem with pseudorandom ciphertexts, see [17].
A digital signature scheme is a public key authentication mechanism that enables a user to sign messages so that anybody can later verify the authenticity of that signature. A digital signature scheme $\mathcal{S} = (\mathrm{KGen}, \mathrm{Sign}, \mathrm{Ver})$ consists of three algorithms such that $\mathrm{KGen}$ on input the security parameter $1^s$ outputs a private and public key pair $(sk, pk)$, $\mathrm{Sign}_{sk}(m)$ outputs a digital signature $\sigma$ of a message $m$ computed using the private key $sk$ and $\mathrm{Ver}_{pk}(m, \sigma)$ verifies the signature $\sigma$ by outputting $1$ if and only if $\sigma$ is a valid signature for $m$. A digital signature scheme should be existentially unforgeable, meaning that no adversary should be able to generate a valid signature for any message $m$ without the private key $sk$. For details, see for example [18].
2.3. Steganography
We follow the terminology of Hopper et al. [
19], where Alice tries to convey a hidden message to Bob. The communication
channel from Alice to Bob is modeled as a probability distribution on bit sequences. Each of the communication bits is augmented with a monotonically increasing timestamp. Communication on the channel is viewed as sampling from this distribution, possibly adaptively bit-by-bit. That is, a channel distribution
produces elements of the form
where
and
for every
. Let
denote a
channel history of bits drawn from the channel. We denote by
the channel distribution conditioned on the history
of already sampled bits together with their timestamps.
Definition 1. A stegosystem with security parameter s on a channel consists of two probabilistic algorithms such that
- 1.
The embedding algorithm takes as input a secret key , a hiddentext message and a channel history and returns a stegotext message .
- 2.
The extraction algorithm takes as input a secret key and a stegotext message and outputs a hiddentext message .
In our investigation, the channel history that is given as input to and the stegotext c given as input to will be replaced by access to the blockchain. That is, in our investigation, the blockchain represents the channel history.
Definition 2. The reliability of a stegosystem with messages of length n is the probability To be useful, we require that the stegosystem has high reliability for hiddentext messages of some length. That is, there is such that for hiddentext messages of length n, the reliability is reasonably high (for example 2/3) meaning that messages can be reliably embedded and extracted.
We also need the stegosystem to be
secure. The security is modeled using a
chosen hiddentext attack by giving an adversary
access to the channel
through a sampling oracle
M and to an additional oracle which is either
that outputs stegotexts embedded by the stegosystem
under a random key
k or an oracle
O that outputs random elements of
for the current history
each with probability 1/2 [
20]. The adversary is free to choose the hidden message
m and needs to distinguish which of these sets of two oracles it was given. If the advantage of distinguishing between these cases,
where
s is the security parameter, is negligible for any adversary, then the stegosystem is secure. For details, see [
20].
3. A Simplified Blockchain Model
We shall start by describing our model for the blockchain. To keep the exposition clear, we apply a simplified model that abstracts away technical details that are irrelevant for our investigation. For example, in practice, consensus can be reached with several different mechanisms, such as proof-of-work or proof-of-stake. However, for our method, we do not need to know the working details of the implementation as long as immutability is guaranteed. Our model is similar to the idealized public ledger model from [
21]. The
simplified ideal blockchain consists of a chain of blocks
C, an existentially unforgeable [
18] digital signature scheme
, a cryptographic hash function
H that is
modeled as a random oracle and two oracles
and
that can be used to read and write data to the blockchain. We shall describe these components in detail below.
For privacy reasons, payment addresses should be used only once for each payment. While it is technically possible, address re-use is generally considered bad practice since it may lead to identity exposure (For Bitcoin, see for example (
https://en.bitcoin.it/wiki/Address_reuse)). Typically, the pseudonyms (that is, the addresses) in a blockchain are made to look random. For example, in Bitcoin and Ethereum the address is computed by hashing the public key. In the random oracle model, the hash function outputs random strings. That is, for any public key $pk$, the corresponding address $a = H(pk)$ is a random string.
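As a minimal sketch of how an address and its least significant bit could be derived from a public key, the following Python snippet uses SHA-256 as a stand-in for the random oracle. The function names, the example key bytes and the choice of SHA-256 are assumptions made for illustration only; the model itself only requires a hash function modeled as a random oracle.

```python
import hashlib


def address_from_public_key(public_key: bytes) -> bytes:
    """Address = hash of the public key (SHA-256 is an assumed stand-in)."""
    return hashlib.sha256(public_key).digest()


def lsb(address: bytes) -> int:
    """Least significant bit of the address read as a big-endian bit string."""
    return address[-1] & 1


addr = address_from_public_key(b"example public key bytes")
bit = lsb(addr)
```

Under the random oracle assumption, `bit` behaves like a fair coin flip for every fresh public key.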
The simplified ideal blockchain contains
Payee identities. In order to join the system, an individual $i$ generates a private and public key pair $(sk_i, pk_i)$ using a digital signature scheme. When paying, individuals are represented by their public keys $pk_i$.
Recipient addresses. Payments to an individual $i$ are sent to a one-time address that the recipient can publish after hashing her public key: $a_i = H(pk_i)$. Each payment address appears only once in a payment and a new private and public key pair is generated to receive additional payments.
Money. The blockchain records the amount of money that is associated to an address $a_i$ and, consequently, to a public key $pk_i$.
Initial status of the blockchain. The initial total amount of money is distributed among a finite set of
L individuals represented by their addresses
. An amount of money
is associated to each of these addresses and the initial status of the blockchain
is public knowledge and the first block of the chain.
Payments. Let
and
be two distinct public keys. A
valid payment P of amount μ from a user
i to a user
j is a tuple
where
is the address of user
j,
is the amount of payment such that
,
t is a unique identifier or a timestamp for the payment and
is a digital signature,
Once the payment is published in the blockchain,
and
are (implicitly) updated accordingly.
The ideal chain of blocks. For the simplified ideal blockchain, all published payments are valid and are published in a tamper-proof chain
C of payment blocks
where
is the initial status of the blockchain and the block
consists of all of the valid payments submitted to the blockchain after the publication of block
. In the ideal system, a new block appears after a finite and fixed amount of time.
Note that users are free to join the system by generating their own private and public key pairs. Any address generated from a public key appearing as the recipient of a payment is valid even if it has never appeared in the blockchain before. In practice, addresses are substrings of the digest in order to make them of a manageable length. However, for simplicity, we use the whole digest. For our proofs to work, it is imperative that a unique address is generated for every payment.
To enable access to the blockchain, in our model the users are given oracle access to oracles and which can be used to read the contents of the blockchain block by block and to submit new payments. These are defined as
on input the block number outputs the block from the chain if it has appeared already. If the last block that has appeared is where , outputs ⊥ indicating error.
on input a payment verifies that the payment is valid and saves it to be published in the next block to appear. If the payment is invalid, it is discarded.
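The read and write oracles described above can be illustrated with a small in-memory Python sketch. It makes several simplifying assumptions that are not part of the model: SHA-256 stands in for the random oracle, digital signatures are not checked at all, block 0 is an empty placeholder for the initial status, and the names `SimplifiedBlockchain`, `Payment` and `publish_block` are hypothetical.

```python
import hashlib
from dataclasses import dataclass


def H(data: bytes) -> bytes:
    """SHA-256 as an assumed stand-in for the random oracle."""
    return hashlib.sha256(data).digest()


@dataclass(frozen=True)
class Payment:
    sender_pk: bytes   # payer's public key
    recipient: bytes   # one-time recipient address
    amount: int
    ident: int         # unique identifier or timestamp
    # NOTE: the signature is omitted; this sketch does not verify signatures.


class SimplifiedBlockchain:
    def __init__(self, initial_balances):
        self.balances = dict(initial_balances)  # address -> amount of money
        self.used = set(initial_balances)       # addresses that already appeared
        self.blocks = [()]                      # block 0: placeholder for initial status
        self.pending = []

    def read(self, i):
        """Read oracle: block i if it has appeared, otherwise None (error)."""
        return self.blocks[i] if 0 <= i < len(self.blocks) else None

    def write(self, p):
        """Write oracle: queue a valid payment, discard an invalid one."""
        src = H(p.sender_pk)
        funds_ok = 0 < p.amount <= self.balances.get(src, 0)
        if not funds_ok or p.recipient in self.used:
            return False
        self.balances[src] -= p.amount
        self.balances[p.recipient] = p.amount
        self.used.add(p.recipient)
        self.pending.append(p)
        return True

    def publish_block(self):
        """Stand-in for the network publishing the next block."""
        self.blocks.append(tuple(self.pending))
        self.pending = []


chain = SimplifiedBlockchain({H(b"alice-pk"): 10})
```

A payment queued with `write` becomes visible through `read` only after the next block is published, mirroring the behavior described for the two oracles.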
4. The Suggested Scheme
In this section, we give a detailed description of our method (Blockchain Covert Channel). We first give a general overview of the scheme. Then we describe the embedding and extraction procedures in detail. Finally, we show that embedding runs in expected polynomial time (excluding the time spent waiting for new blocks to appear on the chain) and that the method is reliable.
4.1. General Overview of
In our scenario, Alice attempts to convey a hidden message to Bob through the blockchain, while the adversary attempts to detect any such communication. Contrary to traditional steganography, Alice has no ability to alter the “covertexts” (that is, the blocks) stored in the blockchain due to the consensus mechanism. However, Alice has complete control on the payments she submits to the chain. In addition, the consensus mechanism will protect these payments from the possible modification attempts of the adversary. We will apply these payments and, in particular, the payment addresses to convey a hidden message to Bob. For simplicity, we shall send one bit for each block appearing in the chain. The general overview of the scheme is the following:
Alice generates a number of private and public key pairs
and generates the payment addresses
corresponding to these keys.
Alice generates payments (of small amounts) from her own account to these addresses and, depending on the hiddentext message m, orders them such that the least significant bits (LSBs) of the payment addresses form m.
Alice submits the payments in the correct order to the blockchain.
Bob reads the blockchain for payments made by Alice and reads the hiddentext message from the LSBs of the payment addresses.
Note that Alice does not lose any money by running the scheme excluding the possible transaction costs, since she controls the generated key pairs
. While the transaction costs for certain blockchain implementations using the proof-of-work paradigm may be significant, there are also blockchains with other consensus mechanisms that do not require transaction costs. The scheme has been depicted in a simplified form in
Figure 1.
4.2. Embedding Into the Blockchain
In this section, we give a detailed description of the embedding algorithm. For simplicity, we assume that Alice sends exactly one payment to the blockchain for each published block. This means that we are able to embed at most one bit for each published block. It would be possible to embed multiple bits per block, but restricting to a single bit keeps the formulation of the method simple and the exposition clear. For the same reason, we assume that the length of the hiddentext is fixed and known to both Alice and Bob. This assumption is not strictly necessary, but it greatly simplifies the description of the algorithms and makes the discussion clearer.
Let
be a simplified ideal blockchain, where
and let the security parameter employed by the blockchain implementation be
. Let
. Let also
be a symmetric encryption scheme that has pseudorandom ciphertexts under the chosen plaintext attack and suppose that the security parameter
is used for key generation. We note that it is crucial that an encryption scheme with pseudorandom ciphertexts is used. The argument of the security later in
Section 5 is based on the pseudorandomness of the ciphertexts. It should be noted that the utility of such encryption schemes in the construction of steganographic algorithms has already been observed in the literature [
17,
22].
Algorithm 2 Embedding Algorithm |
1: | procedure() |
2: | |
3: | Concatenate |
4: | Set |
5: | Interpret as a bit representation |
6: | |
7: | while do |
8: | Generate unseen |
9: | |
10: | Interpret a as a bit representation |
11: | if then |
12: | |
13: | Generate a unique identifier t for the payment |
14: | |
15: | |
16: | Wait for the blockchain to publish a new block |
17: | Update |
18: | |
19: | end if |
20: | end while |
21: | end procedure |
In the following, let the private and public key pair of Alice be . Let denote the history of payments Alice has made through the blockchain. We denote by the probability distribution on the amount of money in a payment of Alice conditioned on the history . It is important that the payments are made according to this distribution to prevent the adversary from detecting the communication. For any consequent payments, we assume that is updated with the most recent payment and is the resulting probability distribution on the amount of money. For clarity, we also assume that Alice does not run out of money.
On input a security parameter , the key generation algorithm outputs a secret key of the form , where is a uniformly random message start indicator such that that will enable Bob to detect the start of the hidden message and k is an encryption key . The length of the message start indicator will play a role in the consideration of the reliability of the extraction. We assume that the concatenated length of the message start indicator together with the ciphertext are known to both Alice and Bob and let us denote this length by , where is the length of the ciphertext c. Embedding is described in Algorithm 2.
Note that since the algorithm waits for a new block to be published after the submission of a payment, only a single payment from is ever included into a single block. This will help Bob to extract the message.
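The core of the embedding loop, generating fresh key pairs until the LSB of the hashed public key matches the next payload bit, can be sketched as follows. This is a simplified illustration under stated assumptions: the payload bits are taken as given (in the scheme they are the message start indicator followed by the ciphertext), SHA-256 stands in for the random oracle, the "key pair" is a random byte string with a hashed stand-in for the public key rather than a real signature key pair, and payment amounts, signatures and waiting for blocks are left out. All function names are hypothetical.

```python
import hashlib
import secrets


def lsb(digest: bytes) -> int:
    """Least significant bit of a digest read as a big-endian bit string."""
    return digest[-1] & 1


def fresh_address_with_lsb(bit: int):
    """Rejection-sample stand-in key pairs until the address LSB equals bit."""
    tries = 0
    while True:
        tries += 1
        sk = secrets.token_bytes(32)                # stand-in private key
        pk = hashlib.sha256(b"pk|" + sk).digest()   # stand-in public key (not a real scheme)
        addr = hashlib.sha256(pk).digest()          # address = hash of public key
        if lsb(addr) == bit:
            return sk, addr, tries


def embed(payload_bits):
    """Return one address per payload bit, plus the total number of key generations."""
    addresses, total_tries = [], 0
    for b in payload_bits:
        _sk, addr, tries = fresh_address_with_lsb(b)
        addresses.append(addr)      # in the scheme: pay to addr, then wait for a block
        total_tries += tries
    return addresses, total_tries


bits = [1, 0, 1, 1, 0, 0, 1, 0]
addresses, total_tries = embed(bits)
```

Each address would be the recipient of exactly one payment, submitted in order, so that the LSBs read block by block reproduce `bits`.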
4.3. Extraction
Extraction is straightforward. Bob reads the blocks from the chain and scans for any transactions made by Alice. Once Bob detects the secret message start indicator $\lambda$, he can read the encrypted hidden message. Since there is a single payment from Alice for each block, the message can be gathered in the correct order. Extraction is described in Algorithm 3.
Algorithm 3 Extraction Algorithm |
1: | procedure() |
2: | |
3: | |
4: | while have not found yet do ▹ Scan for |
5: | |
6: | if then |
7: | Wait until a block appears and read it: |
8: | end if |
9: | for any payment do |
10: | if P is from then |
11: | Extract address a from P and get the LSB |
12: | Scan if we have found the entire |
13: | end if |
14: | end for |
15: | |
16: | end while |
17: | ▹ Now reading the encrypted hidden message |
18: | while do |
19: | |
20: | if then |
21: | Wait until a block appears and read it: |
22: | end if |
23: | for any payment do |
24: | if P is from then |
25: | Extract address a from P and get the LSB |
26: | |
27: | |
28: | end if |
29: | end for |
30: | |
31: | end while |
32: | Compile |
33: | |
34: | output m |
35: | end procedure |
The variables used in
,
and the following proofs have been collected into
Table 1 for easy reference.
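The two phases of extraction, scanning for the message start indicator and then collecting the ciphertext bits, can be sketched on an already collected LSB stream. The sketch assumes that Bob has gathered the LSBs of the addresses of Alice's payments in block order; reading blocks, filtering payments by sender and decryption are omitted, and the function name `extract_ciphertext_bits` is hypothetical.

```python
def extract_ciphertext_bits(lsb_stream, start_indicator, n_cipher_bits):
    """Return the n_cipher_bits bits following the first occurrence of the
    start indicator, or None if the indicator or the full body is missing."""
    k = len(start_indicator)
    for pos in range(len(lsb_stream) - k + 1):
        if lsb_stream[pos:pos + k] == start_indicator:
            body = lsb_stream[pos + k:pos + k + n_cipher_bits]
            # The body may still be incomplete if not all blocks have appeared.
            return body if len(body) == n_cipher_bits else None
    return None


# Three unrelated payments, then the indicator 1011, then four ciphertext bits.
stream = [0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
cipher_bits = extract_ciphertext_bits(stream, [1, 0, 1, 1], 4)
```

Because the sketch stops at the first occurrence of the indicator, an accidental earlier match would cause a wrong extraction; this is the failure event whose probability is bounded in Proposition 2.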
4.4. Computational Complexity and Correctness
We shall now show that the embedding can be done in expected polynomial time excluding the time spent waiting for new blocks to appear. That is, we consider here the computational complexity theoretic notion of “time”, which is the number of computational steps needed for the algorithm to finish its task. Then, we show that the method is correct. That is, for a fixed and for any payload , Alice can embed it into the blockchain and Bob is able to extract it with high reliability.
Let us first establish the computational complexity of the embedding algorithm.
Proposition 1. Suppose that and run in for any payment history . Let be the computational complexity of the encryption algorithm , be the complexity of generating a unique identifier for a payment, be the complexity of the digital signature key generation , be the complexity of computing the hash of a public key and be the complexity of generating a signature for a payment. runs in expected time ofwhere N is the number of embedded bits and is the length of the message start indicator.
Proof. Let be arbitrary. First, Alice runs a single encryption of m which takes steps. The while-loop is repeated for until i reaches the total number of embedded bits N. Now, i is updated whenever the LSB of is equal to . Suppose that the while-loop ends in a total of M iterations. Let denote the random variable on such that whenever obtained from the call to H in the v-th while-loop iteration equals and otherwise. Since H is a random oracle, are independent and distributed according to the Bernoulli distribution with probability 1/2.
Let now I denote the random variable corresponding to i. We have , where M is the number of calls we had to make to H. That is, I is distributed according to the binomial distribution that has the expected value of . Since we need to match N values, we expect the while-loop to run times. In each of the loop iterations, we run . Finally, lines 12–18 are repeated exactly N times. ☐
Note that has to wait for a new block to be generated by the blockchain for each of the bits. Therefore, the actual time spent embedding depends also on the blockchain implementation. Even though the computational complexity of embedding is low (virtually close to the complexity of the applied encryption algorithm), for certain blockchains with inefficient block generation the actual time spent embedding can be long. For example, if the block generation takes time T which is significantly greater than the time needed to encrypt m, the time to embed N bits takes time since the majority of the time is spent waiting for new blocks. The same is naturally true for extraction.
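The expected number of while-loop iterations used in the proof of Proposition 1 can also be written out explicitly. The symbol $G_i$, denoting the number of loop iterations spent until the $i$-th payload bit is matched, is introduced here only for illustration. Since each call to the random oracle matches the required bit independently with probability 1/2, each $G_i$ is geometrically distributed:

```latex
\[
  M = \sum_{i=1}^{N} G_i, \qquad
  \Pr[G_i = g] = 2^{-g} \quad (g \ge 1), \qquad
  \mathbb{E}[G_i] = \sum_{g \ge 1} g \, 2^{-g} = 2, \qquad
  \mathbb{E}[M] = \sum_{i=1}^{N} \mathbb{E}[G_i] = 2N .
\]
```

That is, on average two key pairs are generated and hashed for every embedded bit.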
Next, we prove that is reliable provided that the length of the message start indicator is large enough.
Proposition 2. on a simplified ideal blockchain with a suitably long message start indicator λ is correct and reliable. That is, for messages of any length, Bob receives Alice’s message with the reliability of at leastwhere is the total number of payments Alice has submitted into the blockchain and is the length of the message start indicator λ.
Proof. By our assumptions, the simplified ideal blockchain is tamper-proof meaning that the adversary is unable to prevent Alice’s submissions from appearing on the chain. Therefore, we may assume that Bob receives every block submitted by Alice. For simplicity, let us assume that all of the message has already appeared on the chain if it has been sent.
Suppose that Alice transmitted m. Let us first show that the scheme is correct whenever Bob has detected the correct message start indicator $\lambda$. We shall later show that this happens with high probability. By the description of the embedding algorithm, Alice submits the ciphertext c directly after $\lambda$ by submitting a single bit for each new block. Since the blockchain is tamper-proof, Bob receives all of these blocks and, by the description of the extraction algorithm, extracts the correct c. Finally, Bob decrypts c to obtain the correct hidden message m.
Let us now show that the method is reliable. That is, let us show that Bob is able to detect the correct message start indicator $\lambda$ with high probability. By the description of the extraction algorithm, Bob first scans the blockchain for all of the payments from Alice, extracts the LSB of each address a and scans for the appearance of $\lambda$. There are two ways that the extraction can fail:
The LSBs of the addresses of the payments Alice has made before transmitting $\lambda$ accidentally form $\lambda$, which Bob misinterprets as the start of the hidden message.
Alice has not submitted any message, but the LSBs of the addresses of her payments form $\lambda$.
These two cases are similar, but provided that Alice has submitted the same number of payments in both cases, the reliability is lower in the latter. To see this, we observe that in the first case, we are trying to find a false match for $\lambda$ in the subchain of blocks that appeared before the true match for $\lambda$, while in the latter case, we have the whole blockchain to search for the false match. Therefore, we can restrict ourselves to the second case. By our assumptions, Alice submits exactly one payment for each published block. We now have to derive an upper bound on the probability of these payments forming $\lambda$. Now, H is a random oracle and both $\lambda$ and the addresses are sampled from the uniform distribution.
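Let $x = x_1 x_2 \cdots x_n$ be the string of LSBs of the addresses extracted (in order) from all of Alice’s payments recorded into the blockchain (thus far). Let $X_1, X_2, \ldots, X_n$ denote the random variables corresponding to the choice of $x_1, x_2, \ldots, x_n$, each chosen independently and uniformly at random from $\{0,1\}$. We derive an upper bound for the probability of hitting $\lambda$, $$\Pr[\lambda \text{ appears in } X_1 X_2 \cdots X_n].$$ We observe that $\lambda$ can appear in any starting position $j \in \{1, \ldots, n - |\lambda| + 1\}$, in which case we have $\Pr[X_j X_{j+1} \cdots X_{j+|\lambda|-1} = \lambda] = 2^{-|\lambda|}$. Therefore, we estimate $\Pr[\lambda \text{ appears in } X_1 X_2 \cdots X_n]$ with the sum of the probabilities $\Pr[X_j X_{j+1} \cdots X_{j+|\lambda|-1} = \lambda]$ for $j = 1, \ldots, n - |\lambda| + 1$. This sum over-counts, since for large $n$, $\lambda$ can appear in multiple positions. However, we are only interested in deriving an upper bound. We have, $$\Pr[\lambda \text{ appears in } X_1 X_2 \cdots X_n] \le \sum_{j=1}^{n-|\lambda|+1} 2^{-|\lambda|} = \frac{n - |\lambda| + 1}{2^{|\lambda|}}.$$ Therefore, the reliability of the scheme is at least $$1 - \frac{n - |\lambda| + 1}{2^{|\lambda|}}.$$ ☐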
Interestingly, the reliability depends essentially only on the length of the message start indicator. Note that we have assumed that Alice has submitted a single bit for each block. If multiple bits are embedded into a single block, Proposition 2 needs to be updated accordingly. Finally, the result also assumes that addresses are not reused.
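To get a feeling for how the length of the message start indicator drives the reliability, the union bound from the proof can be evaluated numerically. The sketch below assumes that the sum in the proof runs over the $n - |\lambda| + 1$ possible starting positions of a $|\lambda|$-bit pattern among $n$ payments, each matching with probability $2^{-|\lambda|}$; the function names and parameter names are hypothetical.

```python
from fractions import Fraction


def false_start_bound(n_payments: int, indicator_len: int) -> Fraction:
    """Union bound on the probability that the indicator appears by accident."""
    positions = max(n_payments - indicator_len + 1, 0)
    # A probability bound above 1 carries no information, so cap it at 1.
    return min(Fraction(positions, 2 ** indicator_len), Fraction(1))


def reliability_lower_bound(n_payments: int, indicator_len: int) -> Fraction:
    """Lower bound on the reliability implied by the union bound."""
    return 1 - false_start_bound(n_payments, indicator_len)


short_indicator = reliability_lower_bound(10, 4)
long_indicator = reliability_lower_bound(1000, 64)
```

Exact fractions are used so that the very small failure probability for long indicators is not lost to floating-point rounding.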
5. Security
In this section, we consider the security of
. Following the notions of security for a stegosystem [
20], we derive a security definition for the blockchain based covert channel by modeling a chosen hiddentext attack of a probabilistic polynomial time adversary on
. We start by listing our assumptions. We then proceed to the formulation of a security definition called
payment indistinguishability that requires the adversary to distinguish the payload containing payments from random. Finally, we show that
satisfies this definition.
5.1. Assumptions
Our security proofs are based on a simplified ideal blockchain . In particular, we assume that digital signatures are existentially unforgeable and the applied cryptographic hash function is modeled as a random oracle. There are three participants in our scenario:
Alice represents the transmitter of our scheme and is known through her (payee identity) public key . She has agreed with the recipient Bob beforehand on a secret key that is not known to anyone else. She attempts to send a confidential message m to Bob through the simplified ideal blockchain . Both Alice and Bob know the total amount of embedded bits N. Finally, Alice is aware of her “normal” distribution of payment amounts given her history of payments and is able to sample from it.
Bob represents the recipient of our scheme. He expects a confidential message from Alice through the blockchain, knows the public key of Alice, as well as the secret key and the total number of embedded bits N.
The adversary attempts to detect the presence of covert communication on the blockchain. We assume that the adversary knows Alice and her public key. The adversary also has complete access to the blockchain through the read and write oracles. The job of the adversary is to distinguish the secret communication payments from regular payments.
5.2. Payment Indistinguishability
We apply a computational indistinguishability based approach for the security definition. In particular, we define a scenario, where the adversary has to distinguish the payments containing the hidden message from a set of random payments. To formalize this into a rigorous security definition, we can apply a chosen hiddentext attack described in
Section 2.3. We model the situation by defining the following
payment distinguishing experiment against a stegosystem
in which the adversary
attempts to distinguish between the scenarios where it is given either a set of randomly generated addresses or a set of addresses containing the concealed message, each with probability 1/2.
We give the adversary full control to choose the hiddentext message m and to observe the blockchain. However, we do not give the adversary the power to block Alice from sending payments to the blockchain or to prevent Bob from observing the blockchain. We also do not give him access to the private keys generated by Alice or to the secret key shared between Alice and Bob. Note that since the digital signature scheme is unforgeable, this means that the adversary cannot masquerade as Alice and forge messages into the blockchain.
The probabilistic polynomial time adversary is modeled in two stages. In the first stage, it is given access to the full block history of the blockchain through the read oracle, the ability to submit payments through the write oracle, as well as the public key of Alice. In the first stage, the job of the adversary is to output a hiddentext message m that Alice is required to send to Bob. In the experiment, a coin is then tossed and, based on the outcome, either the stegosystem is applied to send m into the blockchain or a set of random valid payments from Alice is generated. In the second stage, the adversary is invoked with oracle access to the blockchain and it eventually outputs a bit, trying to distinguish whether the blockchain contained the hidden message or just a set of random valid payments. As with the pseudorandom ciphertext experiment, a string storing the internal state information S of the adversary is also passed from the first stage to the second.
The rigorous definition of the payment distinguishing experiment is the following.
Algorithm 4 Payment Distinguishing Experiment |
1: | procedure() |
2: | |
3: | ▹ S is internal state information of the adversary that can be passed to the second stage |
4: | |
5: | |
6: | if then ▹ Actual message is sent to the blockchain |
7: | |
8: | else ▹ Random payments are sent to the blockchain |
9: | |
10: | ▹ is the encryption scheme used by |
11: | Generate N random addresses for |
12: | Simulate to generate payments to |
13: | Submit payments to blockchain one-by-one as does |
14: | end if |
15: | |
16: | if then |
17: | return 1 ▹ guessed correctly |
18: | else |
19: | return 0 ▹ did not guess correctly |
20: | end if |
21: | end procedure |
Definition 3. Let Π be a blockchain stegosystem based on a simplified ideal blockchain and let the adversary be a two-stage probabilistic polynomial time algorithm. The payment distinguishing experiment is defined by Algorithm 4.
We also define the advantage of an adversary in detecting the hidden message based on the payment distinguishing experiment.
Definition 4. The payment detection advantage of an adversary on a blockchain stegosystem Π on a simplified ideal blockchain is the absolute difference between 1/2 and the probability that the payment distinguishing experiment returns 1.
If the payment detection advantage of the adversary is significantly greater than 0, then, in practice, the adversary is able to detect the concealed message. For a secure system, we want this advantage to be negligible.
Definition 5. The blockchain stegosystem Π securely embeds into the blockchain if for every probabilistic polynomial time adversary there is a negligible function ϵ that bounds the payment detection advantage of the adversary from above for every value of the security parameter.
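The experiment and the advantage defined above can be illustrated with a small simulation. The following is only a sketch under stated assumptions, not the construction analyzed in this paper: addresses are modeled as SHA-256 hashes of random bytes, `toy_encrypt` is a hypothetical stand-in for an encryption scheme with pseudorandom ciphertexts, the oracles are omitted, and the convention that b = 1 means the actual message is sent is chosen here for illustration. The example adversary simply counts ones among the address LSBs.

```python
import hashlib
import os

def toy_address(seed):
    # Assumption: an address is modeled as the hash of fresh random key material.
    return hashlib.sha256(seed).digest()

def lsb(addr):
    return addr[-1] & 1

def toy_encrypt(key, message):
    # Hypothetical toy cipher: nonce || (message XOR hash-derived keystream).
    nonce = os.urandom(8)
    stream = hashlib.sha256(key + nonce).digest()
    assert len(message) <= len(stream)
    return nonce + bytes(a ^ b for a, b in zip(message, stream))

def bits_of(data):
    return [(byte >> i) & 1 for byte in data for i in range(7, -1, -1)]

def embed(bits):
    # One fresh address per bit, re-sampled until its LSB equals the bit.
    addresses = []
    for bit in bits:
        addr = toy_address(os.urandom(32))
        while lsb(addr) != bit:
            addr = toy_address(os.urandom(32))
        addresses.append(addr)
    return addresses

def experiment(stage1, stage2):
    key = os.urandom(16)               # shared secret of Alice and Bob
    message, state = stage1()          # adversary chooses the hiddentext
    n_bits = 8 * (8 + len(message))    # ciphertext length in bits
    b = os.urandom(1)[0] & 1           # experiment coin toss
    if b == 1:                         # actual message is embedded
        addresses = embed(bits_of(toy_encrypt(key, message)))
    else:                              # random addresses are used
        addresses = [toy_address(os.urandom(32)) for _ in range(n_bits)]
    guess = stage2(state, addresses)
    return 1 if guess == b else 0

def estimate_advantage(stage1, stage2, trials=400):
    wins = sum(experiment(stage1, stage2) for _ in range(trials))
    return abs(wins / trials - 0.5)

def naive_stage1():
    return b"\x00\x00\x00\x00", None   # a highly non-random hiddentext

def naive_stage2(state, addresses):
    ones = sum(lsb(a) for a in addresses)
    return 1 if ones < len(addresses) / 2 else 0

if __name__ == "__main__":
    print(estimate_advantage(naive_stage1, naive_stage2))
```

Because the toy ciphertext bits look uniform, the estimated advantage of this counting adversary should stay close to 0 up to sampling noise, which is the behavior the definition asks of a secure stegosystem.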
5.3. Security Proof of BLOCCE
We shall now show that BLOCCE securely embeds into the blockchain. In particular, we derive an algorithm that reduces the problem of distinguishing the ciphertexts of the encryption scheme used by BLOCCE to the problem of distinguishing the payments made with BLOCCE. Since the encryption scheme is assumed to have pseudorandom ciphertexts, this shows that there is no adversary that succeeds in the payment distinguishing experiment with non-negligible advantage.
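Proposition 3. BLOCCE securely embeds into a simplified ideal blockchain. For every probabilistic polynomial time adversary against BLOCCE there is a probabilistic polynomial time adversary against the encryption scheme used in BLOCCE such that the payment detection advantage of the former is at most the ciphertext distinguishing advantage of the latter, which in turn is at most ϵ, where ϵ is a negligible function.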
Proof. Consider any two-stage probabilistic polynomial time algorithm as an adversary against BLOCCE; we call it the payment adversary. We need to show that there is a negligible function ϵ that bounds its payment detection advantage. Suppose that there was a payment adversary that succeeds with a non-negligible advantage. Based on such an adversary, we shall construct a probabilistic polynomial time algorithm that applies the payment adversary to achieve a high ciphertext distinguishing advantage for the symmetric encryption scheme used by BLOCCE. In particular, we show that the ciphertext distinguishing advantage is at least the advantage of the payment adversary in the payment distinguishing experiment. Since, by the assumptions, the encryption scheme has pseudorandom ciphertexts (see Section 4.2), the ciphertext distinguishing advantage is negligible for every adversary, and we get the claim.
For this, we construct a two-stage adversary against the encryption scheme that applies the payment adversary, as given below. In the description, we need to save the state information S that the first stage of the payment adversary outputs in order to invoke its second stage later. In addition, since we are emulating a payment distinguishing experiment, we also need to initialize a blockchain and to maintain its state. Therefore, we store the state and internal information of the blockchain into an information string so that the second stage runs coherently.
The constructed adversary is described in Algorithms 5 and 6.
Algorithm 5 First Stage of the Adversary |
1: | procedure() |
2: | Initialize a blockchain |
3: | |
4: | ▹ Answers the queries according to the specification of |
5: | state and internal information of |
6: | output |
7: | end procedure |
Algorithm 6 Second Stage of the Adversary |
1: | procedure() |
2: | Initialize a blockchain according to the state |
3: | |
4: | Embed into by simulating |
5: | |
6: | output |
7: | end procedure |
If the payment adversary is probabilistic polynomial time, so is the constructed adversary. Suppose that the constructed adversary is run in the ciphertext distinguishing experiment. Let D denote the random variable corresponding to the experiment coin toss (b of line 4), indicating whether the constructed adversary was given the correct ciphertext c of m under a random key k or a uniformly random string c. Depending on D, we have the following two cases:
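Suppose first that the constructed adversary was given the correct ciphertext c. Then it embeds c into the blockchain, which follows the payment distinguishing experiment for BLOCCE in the case where the actual message is sent. Since the constructed adversary outputs the same bit as the payment adversary, its success probability in this case equals that of the payment adversary in the corresponding case of the payment distinguishing experiment.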
Suppose now that c is a uniformly random string. By the description of the constructed adversary, the string that it embeds is then also uniformly random, meaning that a uniformly random string gets embedded into the blockchain. By the description of the payment distinguishing experiment, this is equal to the case where random payments are sent, and the success probabilities of the two adversaries again coincide.
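We have established that the constructed adversary, which distinguishes the ciphertexts of the encryption scheme from random, succeeds in its experiment if and only if the payment adversary succeeds in its own payment distinguishing experiment. Therefore, the two experiments return 1 with the same probability. By the definition of advantage, the ciphertext distinguishing advantage of the constructed adversary is then at least the payment detection advantage of the payment adversary. By the definition of BLOCCE (see Section 4.2), the applied symmetric encryption scheme has pseudorandom ciphertexts under a chosen plaintext attack, and there is a negligible function ϵ that bounds the ciphertext distinguishing advantage for every value of the security parameter (see Section 2.2). Since the payment adversary was any two-stage probabilistic polynomial time adversary and its advantage is bounded by ϵ, we have the claim, and BLOCCE securely embeds into a simplified ideal blockchain. ☐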
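The two cases of the proof combine into a short chain of equalities. The following is a sketch in assumed notation that does not come from the text above: $\mathcal{A}$ is the adversary against BLOCCE, $\mathcal{B}$ the adversary constructed from it, $\mathsf{Pay}_{\mathcal{A}}$ and $\mathsf{Ct}_{\mathcal{B}}$ the outputs of the payment distinguishing and ciphertext distinguishing experiments, $b$ and $D$ their coin tosses (with the value 1 standing for the real message or real ciphertext, an assumed convention), $n$ the security parameter, and each advantage is taken to be the distance of the success probability from 1/2.

```latex
\begin{align*}
\Pr[\mathsf{Ct}_{\mathcal{B}} = 1]
  &= \tfrac12 \Pr[\mathsf{Ct}_{\mathcal{B}} = 1 \mid D = 1]
   + \tfrac12 \Pr[\mathsf{Ct}_{\mathcal{B}} = 1 \mid D = 0] \\
  &= \tfrac12 \Pr[\mathsf{Pay}_{\mathcal{A}} = 1 \mid b = 1]
   + \tfrac12 \Pr[\mathsf{Pay}_{\mathcal{A}} = 1 \mid b = 0] \\
  &= \Pr[\mathsf{Pay}_{\mathcal{A}} = 1], \\
\mathrm{Adv}^{\mathrm{pay}}(\mathcal{A})
  &= \bigl|\Pr[\mathsf{Pay}_{\mathcal{A}} = 1] - \tfrac12\bigr|
   = \bigl|\Pr[\mathsf{Ct}_{\mathcal{B}} = 1] - \tfrac12\bigr|
   = \mathrm{Adv}^{\mathrm{ct}}(\mathcal{B}) \le \epsilon(n).
\end{align*}
```

The first line uses that the coin toss is uniform, the second line applies the two cases of the proof, and the final inequality is the pseudorandom ciphertext assumption on the encryption scheme.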
6. Discussion and Future Work
To simplify our investigations, we restricted ourselves to the embedding of a single bit for each block. For many applications, such as Bitcoin, the time to publish a new block is counted in minutes instead of seconds. Therefore, the throughput of our scheme is low. However, it is easy to increase the number of embedded bits per block. One way to do it would be to match multiple LSBs of the address. However, in such a case, the expected computation time for embedding grows exponentially in the number of bits, since the bits are drawn from the random oracle. Proposition 2 would also need to be updated accordingly. It is more efficient to submit several payments into the same block. In such a case, the ordering of the message bits has to be ensured, for example, by using the unique payment identifier t or the public payer keys.
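The cost of matching several LSBs can be made concrete with a small sketch. It is an illustration under assumptions rather than part of the scheme as specified: `toy_address` models an address as the SHA-256 hash of random bytes, and all function names are hypothetical. Each k-bit chunk of the payload is embedded by re-sampling addresses until the k lowest bits match, which takes 2^k attempts in expectation.

```python
import hashlib
import os

def toy_address(seed):
    # Assumption: an address is modeled as a hash of fresh key material.
    return hashlib.sha256(seed).digest()

def low_bits(addr, k):
    # Value of the k least significant bits of the address.
    return int.from_bytes(addr, "big") & ((1 << k) - 1)

def embed_chunk(value, k):
    # Re-sample until the k LSBs equal `value`; return address and attempt count.
    tries = 0
    while True:
        tries += 1
        addr = toy_address(os.urandom(32))
        if low_bits(addr, k) == value:
            return addr, tries

def embed_bits(bits, k):
    # Split the payload into k-bit chunks and embed one chunk per address.
    assert len(bits) % k == 0
    addresses, total_tries = [], 0
    for i in range(0, len(bits), k):
        value = int("".join(str(b) for b in bits[i:i + k]), 2)
        addr, tries = embed_chunk(value, k)
        addresses.append(addr)
        total_tries += tries
    return addresses, total_tries

def extract_bits(addresses, k):
    # Read the k LSBs of each address back, most significant bit first.
    bits = []
    for addr in addresses:
        value = low_bits(addr, k)
        bits.extend((value >> s) & 1 for s in range(k - 1, -1, -1))
    return bits

if __name__ == "__main__":
    payload = [1, 0, 1, 1, 0, 0, 1, 0]
    for k in (1, 2, 4):
        addrs, tries = embed_bits(payload, k)
        print(k, len(addrs), tries, extract_bits(addrs, k) == payload)
```

Larger k reduces the number of payments but raises the number of address generations per payment, which reflects the exponential growth noted above.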
Our method requires the user to distribute the message bits over several payments. For many contemporary blockchains that apply the proof-of-work paradigm for their consensus mechanism, the transaction costs may be significant for larger hidden messages. However, there are newer consensus mechanisms that aim to address the energy usage issue of the proof-of-work paradigm, to improve block generation efficiency and, ultimately, possibly to remove transaction costs in certain applications. Such mechanisms include proof-of-stake, delegated proof-of-stake and Byzantine fault tolerance based mechanisms. For example, at the moment, Ethereum is planning on moving to the proof-of-stake paradigm. While there will be high transaction costs in certain use cases of blockchain, we believe that, in many applications, transaction costs will not be an issue and our method will prove to be useful.
The performance of our method can also be increased by pre-computing a list of L addresses, where L is significantly greater than N, the total length of the embedded message. Since the LSBs of these generated addresses are random, approximately one half will be ones and the rest zeros. Once Alice is ready to embed the payload, she can pick the addresses in order from the pre-generated list. This approach would also mitigate the computational complexity of embedding multiple bits into a single payment. Furthermore, the addresses to embed can be pre-computed immediately after the key has been agreed on. However, we leave these considerations for future work.
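One possible way to organize such a pre-computed list is sketched below. This is an assumption-laden illustration, not a specified part of the method: addresses are modeled as SHA-256 hashes of random secret bytes, the pools keyed by LSB are a design choice made here, and the names `precompute` and `pick_addresses` are hypothetical.

```python
import hashlib
import os
from collections import deque

def toy_keypair():
    # Assumption: the "address" is a hash of freshly generated secret bytes.
    secret = os.urandom(32)
    return secret, hashlib.sha256(secret).digest()

def precompute(L):
    # Generate L addresses ahead of time and bucket them by their LSB.
    pools = {0: deque(), 1: deque()}
    for _ in range(L):
        secret, addr = toy_keypair()
        pools[addr[-1] & 1].append((secret, addr))
    return pools

def pick_addresses(pools, payload_bits):
    # For each payload bit, take the oldest unused address with that LSB.
    chosen = []
    for bit in payload_bits:
        if not pools[bit]:
            raise RuntimeError("pool exhausted; precompute more addresses")
        chosen.append(pools[bit].popleft())
    return chosen

if __name__ == "__main__":
    pools = precompute(200)
    payload = [1, 0, 0, 1, 1, 0, 1, 0]
    chosen = pick_addresses(pools, payload)
    print([addr[-1] & 1 for _, addr in chosen] == payload)
```

Each address is removed from its pool once used, so no address is reused across payments, and embedding at send time becomes a constant-time lookup per bit.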
It should be noted that it is important that the hidden message is first encrypted using an encryption scheme that has pseudorandom ciphertexts. If that is not the case, the adversary is able to detect the non-random nature of the LSBs of the addresses in Alice’s payments. For the same reason, the underlying blockchain should not allow the reuse of addresses. In addition to being a privacy risk [23, 24, 25], address reuse would render the payment indistinguishability approach to security inapplicable. For such systems, we would need to formulate a security definition that is based on the probability distribution of “normal” payments of Alice, and the payments containing the payload should be indistinguishable from this distribution. We leave these considerations also for future work.
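The need for pseudorandom ciphertexts can be seen in a very simple statistic. The following sketch is an illustration only: it assumes the payload is embedded bit by bit, uses an arbitrary ASCII example message, and lets `os.urandom` stand in for a ciphertext that is indistinguishable from random. In ASCII text the top bit of every byte is 0, so every eighth embedded bit would be 0, whereas for a random-looking ciphertext that position is 1 about half of the time.

```python
import os

def top_bit_rate(data):
    # Fraction of bytes whose most significant bit is 1.
    return sum(byte >> 7 for byte in data) / len(data)

# Example ASCII hiddentext embedded without encryption (hypothetical message).
plaintext = b"meet me at the usual place at nine" * 30

# Stand-in for a pseudorandom ciphertext of the same length.
random_like = os.urandom(len(plaintext))

if __name__ == "__main__":
    print(top_bit_rate(plaintext))    # exactly 0.0 for pure ASCII
    print(top_bit_rate(random_like))  # close to 0.5
```

An adversary observing the LSBs of Alice's addresses could apply this kind of test directly, which is why the encryption step cannot be skipped.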
In this paper, we have not implemented BLOCCE in practice. Instead, we have considered it in the simplified ideal blockchain model that abstracts away details that are not relevant for the theoretical investigations. For example, the network is completely abstracted away in the simplified model. However, the details, such as the network, are relevant when considering a practical implementation of the scheme. We leave it as future work to investigate the secure implementation of BLOCCE using an existing blockchain such as Ethereum.
7. Conclusions
We suggest the first provably secure method, called BLOCCE, of implementing a covert communication channel over a blockchain. Our proofs are given in the random oracle model, where the cryptographic hash function used by the blockchain is modeled as a random oracle. We formulate a simplified ideal blockchain that models the blockchain implementations underlying existing cryptocurrencies. Based on this model, we suggest a method of embedding a single bit for each block using payments submitted to the blockchain. We show that the method is reliable and runs in expected polynomial time. The method can be generalized to embed multiple bits to increase the throughput. To model the security of covert channels on a blockchain, we formulate the notion of payment indistinguishability, where the transmitted hidden message should be computationally indistinguishable from random payments. Finally, we show that BLOCCE satisfies this definition.