Function Computation under Privacy, Secrecy, Distortion, and Communication Constraints

The problem of reliable function computation is extended by imposing privacy, secrecy, and storage constraints on a remote source whose noisy measurements are observed by multiple parties. The main additions to the classic function computation problem are as follows: (1) privacy leakage to an eavesdropper is measured with respect to the remote source rather than the transmitting terminals' observed sequences; (2) the information leakage to a fusion center with respect to the remote source is considered as a new privacy leakage metric; (3) the function computed is allowed to be a distorted version of the target function, which reduces the storage rate as compared to reliable function computation, in addition to reducing the secrecy and privacy leakages; (4) the observations of two transmitting nodes are used to compute a function. Inner and outer bounds on the rate regions are derived for lossless and lossy single-function computation with two transmitting nodes, recovering previous results in the literature. For special cases, including invertible and partially invertible functions and degraded measurement channels, simplified lossless and lossy rate regions are characterized, and one achievable region is evaluated for an example scenario.


Introduction
We consider function computation scenarios in a network with multiple nodes. Each node observes a random sequence, and all observed random sequences are modeled as correlated. Recent advancements in network function virtualization [3] and distributed machine learning applications [4] make function computation in a wireless network via software-defined networking an important practical problem that should be tackled to improve the performance of future communication systems. In a classic function computation scenario, the nodes exchange messages through authenticated, noiseless, and public communication links, which results in undesired information leakage about the computed function [5-7]. Furthermore, it is possible to reduce the amount of public communication [8,9] by using distributed lossless or lossy source coding methods; see [10-14] for several extensions. The former method uses Slepian-Wolf (SW) coding [15] constructions, whereas the latter allows the computed function to be a distorted version of the target function and applies Wyner-Ziv (WZ) coding [16] methods, which yield further rate reductions. A decrease in public communication is also important to limit the information about the computed function leaked to an eavesdropper in the same network, i.e., the secrecy leakage. In addition to the public messages, an eavesdropper generally has access to a random sequence correlated with the other sequences; see [17-19] for various secure function computation extensions.
An important addition to the secure function computation model is a privacy constraint that measures the amount of information about the observed sequence leaked to an eavesdropper [20]. Providing privacy is necessary to ensure the confidentiality of a private sequence that can be reused for future function computations [21,22]. An extension of the results in [20] is given in [23], where two privacy constraints are imposed on a remote source whose different noisy measurements are observed by multiple nodes in the same network. The extension in [23] differs from previous secure and private function computation models because it posits that there exists a remote source that is the main reason for the correlation between the random sequences observed by the nodes in the same network. It is illustrated via practical examples that considering a remote source prevents unexpected decreases in reliability and unnoticed secrecy leakage [22]. Similar remote source models are proposed, e.g., in [24] for biometric secrecy and in [25,26] for user or device authentication problems. It is shown in [23] that with such a remote source model two different privacy leakage rate values should be limited, unlike the single constraint considered in [20].
We consider a private remote source whose three noisy versions are used for secure single-function computation. Suppose two nodes transmit public indices to a fusion center to compute one function. In [23], one node sends a public index to a fusion center for each function computation. In [20], cases with two transmitting nodes are considered for a visible source model; these results are improved in this work for a remote source model with an additional privacy leakage constraint. Furthermore, we also consider function computation scenarios where the computed function is allowed to be a distorted version of the target function, which is relevant for various recent function computation applications.

Models for Function Inputs and Outputs
We consider noisy remote source output measurements that are independent and identically distributed (i.i.d.) according to a fixed probability distribution and that are inputs of a target function. This model is reasonable if, e.g., one uses transform-coding algorithms from [27-30] to extract almost i.i.d. symbols, as applied in the biometric security, physical unclonable function, and image and video coding literature. Furthermore, the target functions we study are applied per letter, i.e., the same function is applied to each input symbol; see Section 2 below. These functions are realistic and are used in various recent applications, such as distributed and federated learning applications where the same loss function is applied to each data example [31].
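As a small illustration, the per-letter model applies a single-letter function symbol-wise to the input sequences. The sketch below is illustrative only; the XOR function used here is a hypothetical example, not one from the paper.

```python
# Minimal sketch of the per-letter model: the same single-letter function f
# is applied to every symbol triple of the input sequences.
def per_letter(f, x1_seq, x2_seq, y_seq):
    """Compute f^n(x1^n, x2^n, y^n) = (f(x1_i, x2_i, y_i)) for i = 1..n."""
    return [f(a, b, c) for a, b, c in zip(x1_seq, x2_seq, y_seq)]

# Hypothetical example function: binary XOR of the three observations.
out = per_letter(lambda a, b, c: a ^ b ^ c, [1, 0, 1], [0, 0, 1], [1, 1, 0])
```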

Summary of Contributions
We extend the lossless and lossy rate region analysis of the single-function computation model with one transmitting node in [23] to consider two transmitting nodes with joint secrecy and privacy constraints, as well as a distortion constraint on the computed function. A summary of the main contributions is as follows.

• The lossless single-function computation model with two transmitting nodes is considered, and an inner bound for the rate region that characterizes the optimal trade-off between secrecy, privacy, storage, and distortion constraints is established by using the output statistics of random binning (OSRB) method [32]. An outer bound for the same rate region is also provided by using standard properties of Shannon entropy. The inner and outer bounds are shown not to match in general due to the different Markov chains imposed.

• The proposed inner and outer bounds are extended to the lossy single-function computation model with two transmitting nodes by considering a distortion metric. Furthermore, the effects of imposing a distortion constraint, rather than a reliability constraint, on the function computation are discussed.

• For partially invertible functions, which form a set that is a proper superset of the set of invertible functions, as well as for invertible functions, we establish simplified lossless and lossy rate region bounds.

• The simplified rate region bounds for invertible functions are further simplified when the eavesdropper's measurement channel is physically degraded with respect to the fusion center's channel, or vice versa, which results in different bounds on the rates.

• We evaluate an achievable rate region for a physically degraded case with multiplicative Bernoulli noise components.

Organization
This paper is organized as follows. In Section 2, we introduce the lossless and lossy single-function computation problems with two transmitting nodes under secrecy, privacy, storage, and reliability or distortion constraints. In Section 3, we present the inner and outer bounds for the rate regions of the introduced problems and discuss why the bounds differ because of the different Markov chains imposed. In Section 4, we establish simplified lossless and lossy rate region bounds for invertible functions, partially invertible functions, and two different degraded measurement channels, and we evaluate an achievable rate region for an example case. In Section 5, we provide proofs of the inner and outer bounds for lossless single-function computation with two transmitting nodes. In Section 6, we conclude the paper.

Notation
Upper case letters represent random variables and lower case letters their realizations. A superscript denotes a sequence of variables, e.g., X^n = (X_1, X_2, ..., X_i, ..., X_n), and a subscript i denotes the position of a variable in the sequence. A random variable X has probability distribution P_X. Calligraphic letters such as X denote sets, and set sizes are written as |X|.

System Model
We consider the single-function computation model with two transmitting nodes illustrated in Figure 1. Noisy measurements X_1^n and X_2^n of an i.i.d. remote source X^n ~ P_X^n through memoryless channels P_X1|X and P_X2|X, respectively, are observed by two legitimate nodes in a network. Similarly, other noisy measurements Y^n and Z^n of the same remote source are observed by the fusion center and the eavesdropper (Eve), respectively, through another memoryless channel P_YZ|X. Encoders Enc_1(.) and Enc_2(.) of the legitimate nodes send indices W_1 and W_2, respectively, to the fusion center over public communication links with storage rate constraints. The fusion center decoder Dec(.) then uses its observed noisy sequence Y^n and the public indices W_1 and W_2 to estimate a function f^n(X_1^n, X_2^n, Y^n). The source and measurement alphabets are finite sets. A natural secrecy leakage constraint is to minimize the information leakage about the function output f^n(X_1^n, X_2^n, Y^n) to the eavesdropper. However, its analysis depends on the specific function f(.,.,.) computed, so we impose below another secrecy leakage constraint that does not depend on the function used and that provides an upper bound on the secrecy leakage for all functions, as considered in [20,23]. Furthermore, we impose two privacy leakage constraints to minimize the information leakage about X^n to the fusion center and to the eavesdropper, because the same remote source would be measured if another function were computed in the same network (see also [21] for motivations to consider privacy leakage with respect to a remote source), as well as public storage constraints that minimize the storage rate for the transmitting nodes. We next define the lossless and lossy single-function computation rate regions.
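The measurement model above can be sketched as a toy simulation: an i.i.d. remote source is observed through independent memoryless channels by the two transmitting nodes, the fusion center, and Eve. The binary alphabet, the binary symmetric channels, and all parameter values below are illustrative assumptions; the model itself allows arbitrary finite alphabets and channels P_X1|X, P_X2|X, and P_YZ|X.

```python
import random

def simulate_sources(n, p_flip=0.1, seed=0):
    """Toy simulation of the remote-source model: an i.i.d. Bern(0.5) remote
    source X^n is observed through independent memoryless channels by the two
    transmitting nodes (X_1^n, X_2^n), the fusion center (Y^n), and Eve (Z^n).
    Binary symmetric channels with crossover p_flip are an illustrative
    assumption, not part of the paper's general model."""
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    # Each observer sees the source through its own independent BSC(p_flip).
    bsc = lambda seq: [b ^ (rng.random() < p_flip) for b in seq]
    return x, bsc(x), bsc(x), bsc(x), bsc(x)

x, x1, x2, y, z = simulate_sources(1000)
```

Because all four observations are noisy versions of the same X^n, they are correlated with each other only through the remote source, which is exactly the structural assumption the paper exploits.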

Lossless Single-Function Computation
Consider the single-function computation model illustrated in Figure 1. The corresponding lossless rate region is defined as follows.
Definition 1. A lossless tuple (R_s, R_w,1, R_w,2, R_ℓ,Dec, R_ℓ,Eve) is achievable if, for any δ > 0, there exist n ≥ 1, two encoders, and one decoder such that the reliability, secrecy leakage, privacy leakage, and storage constraints (1)-(7) are satisfied. The lossless region R is the closure of the set of all achievable lossless tuples. ♦

Lossy Single-Function Computation
The corresponding lossy rate region for the single-function computation model illustrated in Figure 1 is defined as follows.

Definition 2. A lossy tuple (R_s, R_w,1, R_w,2, R_ℓ,Dec, R_ℓ,Eve, D) is achievable if, for any δ > 0, there exist n ≥ 1, two encoders, and one decoder such that (3)-(7) are satisfied and the expected distortion constraint (8) holds, where d(.,.) is a per-letter distortion metric. The lossy region R_D is the closure of the set of all achievable lossy tuples. ♦
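The expected distortion in Definition 2 averages a per-letter metric over the block. The following is a minimal sketch, assuming Hamming distortion as the per-letter metric d(.,.) for illustration (the definition permits any per-letter metric).

```python
def avg_distortion(f_seq, g_seq, d=lambda a, b: int(a != b)):
    """Average per-letter distortion (1/n) * sum_i d(f_i, g_i).
    Hamming distortion is the default here as an illustrative choice;
    Definition 2 allows any per-letter distortion metric d."""
    assert len(f_seq) == len(g_seq)
    return sum(d(a, b) for a, b in zip(f_seq, g_seq)) / len(f_seq)
```

For instance, comparing a target function output sequence with a reconstructed sequence that differs in half of its positions yields an average Hamming distortion of 0.5.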

Lossless Single-Function Computation
We first extend the notion of admissibility defined in [8] for a single auxiliary random variable to two auxiliary random variables, which is used in the inner and outer bounds given below for lossless function computation; see also [20, Theorem 3].

Definition 3. A pair (U_1, U_2) of auxiliary random variables is admissible for the function f(X_1, X_2, Y) if f(X_1, X_2, Y) can be recovered from (U_1, U_2, Y) and U_1 - X_1 - (X_2, X, Y, Z) and U_2 - X_2 - (X_1, X, Y, Z) form Markov chains. ♦

We next provide inner and outer bounds for the lossless region R; see Section 5 for a proof sketch.

Theorem 1. (Inner Bound): An achievable lossless region is the union, over all P_Q and all auxiliary random variable distributions such that (U_1, U_2) is admissible, of the rate tuples in (13)-(18), where (19) holds.

(Outer Bound): An outer bound for the lossless region R is the union of the rate tuples in (13), (16)-(18), (20), and (21) such that (22) and (23) form Markov chains. One can limit the cardinalities of the auxiliary random variables.

We remark that if the joint probability distribution in (19) is imposed on the outer bound, then (20) and (21) recover (14) and (15), respectively, because (22) and (23) then form Markov chains for (19). However, the outer bound that satisfies (22) and (23) defines a rate region that is in general larger than the rate region defined by the inner bound that satisfies (19). Thus, the inner and outer bounds differ in general. The results in Theorem 1 recover previous results, including [20, Theorem 3], and, naturally, also other results recovered by those previous results, such as the SW coding region.

Lossy Single-Function Computation
We next provide inner and outer bounds for the lossy region R_D; see below for a proof sketch.

Theorem 2. (Inner Bound): An achievable lossy region is the union, over all P_Q and all auxiliary random variable distributions, of the rate tuples in (13)-(18) and (26) for some function g(.,.,.), where (19) holds.

(Outer Bound): An outer bound for the lossy region R_D is the union over all P_Q and all auxiliary random variable distributions of the rate tuples in (13), (16)-(18), (20), (21), and (26) such that (22) and (23) form Markov chains. One can limit the cardinalities of the auxiliary random variables.

Proof Sketch. The achievability proof of the lossy function computation problem follows from the achievability proof of its lossless version given in Section 5.1 by replacing the admissibility constraint with the constraint that P_U1|X1, P_U2|X2, P_V1|U1, and P_V2|U2 are chosen such that there exists a function g(U_1, U_2, Y) that satisfies the distortion constraint up to ε_n > 0 such that ε_n → 0 when n → ∞. Since all (x_1^n, x_2^n, y^n, u_1^n, u_2^n) tuples are in the jointly typical set with high probability, by the typical average lemma [33, p. 26], the constraint in (8) is satisfied.
The proof of the outer bound applies standard properties of Shannon entropy and follows mainly from the outer bound proof for the lossless function computation problem given in Section 5.2. However, the proof for the lossless function computation problem requires the auxiliary random variables to be admissible as defined in Definition 3, unlike the lossy function computation problem. Thus, the outer bound proof for Theorem 2 follows by replacing the admissibility step (95) in the outer bound proof for the lossless function computation problem with a step in which (a) follows by (8) and (9), (b) follows since there exists a function g_i(.,.,.) that achieves a distortion that is not greater than the distortion achieved by the decoder, (c) follows from the Markov chain given in (99), and (d) follows from the definitions of U_1,i and U_2,i given in (90) and (91), respectively. Furthermore, the proof of the cardinality bounds for the lossy case follows from the proof for the lossless case, since we preserve the same probability and conditional entropy values as for the lossless function computation problem, in addition to preserving the value of the distortion. Exactly as in Theorem 1, the inner and outer bounds given in Theorem 2 do not match in general because of the different Markov chains imposed.
Remark 1. Since all secrecy and privacy rate terms given in the outer bounds in Theorems 1 and 2, i.e., the lower bounds in (13), (17), and (18), are generally strictly positive, strong secrecy or strong privacy constraints cannot be satisfied in general for the lossless and lossy single-function computation problems.
We next provide the simplified rate region bounds for various sets of computed functions f (•, •, •) and measurement channels P YZ|X .

Rate Regions for Special Sets of Computed Functions and Measurement Channels
The terms that characterize the rate region bounds for the lossless and lossy function computation problems for various sets of functions and channels are the same, except for (1) the removal of the admissibility requirement; (2) the addition of a distortion constraint; and (3) the increase in the cardinality bounds on the auxiliary random variables for the lossy case as compared to the lossless case. Thus, we provide simplified rate region bounds only for the lossless case. However, we remark that the optimal auxiliary random variables for the lossless and lossy cases might differ. Therefore, the corresponding lossless and lossy rate regions might differ for the same joint probability distribution P_X1X2XYZ.

Partially-Invertible Functions
We now impose the condition that the function f(X_1, X_2, Y) is partially invertible with respect to X_1, i.e., we have [11,34]

H(X_1 | f(X_1, X_2, Y), X_2, Y) = 0.

For such functions, it is straightforward to show that we have the following achievable rate region for the lossless function computation problem with two transmitting nodes. We remark that the proof of Lemma 1 follows from the inner bound in Theorem 1 by assigning U_1 = X_1, and the corresponding outer bound can be obtained similarly from Theorem 1. Furthermore, by symmetry, the lossy rate region bounds for a function f(X_1, X_2, Y) that is partially invertible with respect to X_2 can be obtained by assigning U_2 = X_2.

Lemma 1. The lossless region R when f(X_1, X_2, Y) is a partially invertible function with respect to X_1 includes the set of all tuples (R_s, R_w,1, R_w,2, R_ℓ,Dec, R_ℓ,Eve) such that U_2 is admissible for the function f(X_1, X_2, Y) and for some function g(.,.,.) such that (19) holds with U_1 = X_1.
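Over a finite alphabet, partial invertibility with respect to X_1 can be checked by brute force: X_1 must be uniquely determined by (f(X_1, X_2, Y), X_2, Y). The sketch below uses hypothetical binary example functions chosen for illustration, not functions from the paper.

```python
from itertools import product

def is_partially_invertible_x1(f, alphabet=(0, 1)):
    """Brute-force check of H(X1 | f(X1,X2,Y), X2, Y) = 0 over a finite
    alphabet: x1 must be uniquely determined by (f(x1,x2,y), x2, y)."""
    seen = {}
    for x1, x2, y in product(alphabet, repeat=3):
        key = (f(x1, x2, y), x2, y)
        # If two different x1 values map to the same (f, x2, y), x1 is not
        # recoverable and the function is not partially invertible w.r.t. X1.
        if seen.setdefault(key, x1) != x1:
            return False
    return True

# Hypothetical examples: f = x1 XOR x2 is partially invertible w.r.t. X1
# (x1 = f XOR x2), while f = x2 XOR y ignores x1 entirely and is not.
```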

Invertible Functions
Suppose now that the function f(X_1, X_2, Y) is invertible, i.e., we have [11,34]

H(X_1, X_2 | f(X_1, X_2, Y), Y) = 0.

We provide in Lemma 2 below an achievable rate region for the lossless computation problem with two transmitting nodes when the function f(X_1, X_2, Y) is invertible. The proof of Lemma 2 follows from Theorem 1 by assigning U_1 = X_1, U_2 = X_2, and constant V_1 and V_2. Note that choosing V_1 and V_2 constant generally results in suboptimal rate regions.
Lemma 2. The lossless rate region R when f(X_1, X_2, Y) is an invertible function includes the set of all tuples (R_s, R_w,1, R_w,2, R_ℓ,Dec, R_ℓ,Eve) satisfying

R_s ≥ I(X_1, X_2; Z|Q) − I(X_1, X_2; Y|Q)

where Q − (X_1, X_2) − X − (Y, Z) form a Markov chain. One can limit the cardinality to |Q| ≤ 2.
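Invertibility can be checked the same way: the pair (X_1, X_2) must be uniquely determined by (f(X_1, X_2, Y), Y). Again, the binary example functions below are hypothetical illustrations.

```python
from itertools import product

def is_invertible(f, alphabet=(0, 1)):
    """Brute-force check of H(X1, X2 | f(X1,X2,Y), Y) = 0 over a finite
    alphabet: the pair (x1, x2) must be uniquely determined by (f, y)."""
    seen = {}
    for x1, x2, y in product(alphabet, repeat=3):
        key = (f(x1, x2, y), y)
        # If two different (x1, x2) pairs map to the same (f, y), the pair
        # is not recoverable and the function is not invertible.
        if seen.setdefault(key, (x1, x2)) != (x1, x2):
            return False
    return True

# Hypothetical examples: f = 2*x1 + x2 encodes the pair and is invertible;
# f = x1 XOR x2 loses the individual values and is not.
```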

Invertible Functions and Two Different Degraded Channels
The lossless rate region given in Lemma 2 can be further simplified by imposing conditions on the measurement channel P YZ|X in addition to the function f ( X 1 , X 2 , Y) being invertible.We next establish achievable lossless rate regions for two different physically degraded channels.

Eve's Channel is Physically-Degraded
Suppose the measurement channel P_YZ|X is physically degraded such that

X − Y − Z form a Markov chain, i.e., P_YZ|X = P_Y|X P_Z|Y. (44)

For invertible functions and physically degraded measurement channels P_YZ|X as defined in (44), we provide an achievable lossless rate region in Lemma 3. The proof of Lemma 3 follows from Lemma 2 by using the Markov chain (X_1, X_2) − X − Y − Z for this case, which follows from (44).
Lemma 3. The lossless rate region R when f(X_1, X_2, Y) is an invertible function and P_YZ|X is as given in (44) includes the set of all tuples (R_s, R_w,1, R_w,2, R_ℓ,Dec, R_ℓ,Eve) satisfying (39)-(42) together with the simplified secrecy and privacy leakage bounds for this case.

Fusion Center's Channel is Physically-Degraded
Suppose the measurement channel P_YZ|X is physically degraded such that

X − Z − Y form a Markov chain, i.e., P_YZ|X = P_Z|X P_Y|Z. (48)

For invertible functions and physically degraded measurement channels P_YZ|X as defined in (48), we provide an achievable lossless rate region in Lemma 4. The proof of Lemma 4 follows from Lemma 2 by using the Markov chain (X_1, X_2) − X − Z − Y for this case, which follows from (48).
Lemma 4. The lossless rate region R when f(X_1, X_2, Y) is an invertible function and P_YZ|X is as given in (48) includes the set of all tuples (R_s, R_w,1, R_w,2, R_ℓ,Dec, R_ℓ,Eve) satisfying (39)-(42) together with the simplified secrecy and privacy leakage bounds for this case.

Remark 2. The rate regions given in Lemmas 2-4 can be plotted by computing the terms that characterize the regions, since P_X1X2XYZ is fixed for the function computation problems considered. However, the rate region given in Lemma 1, similar to the inner bounds given in Theorems 1 and 2, might not be easy to characterize due to the requirement to optimize the auxiliary random variables, whose cardinalities are bounded by large terms. Thus, evaluating the rate region for a function computation problem with two transmitting terminals is generally significantly more difficult than characterizing the rate region for function computation with one transmitting terminal; see [23] for an information bottleneck example for the latter problem.
We next evaluate an achievable lossless rate region R by using Lemma 4 for specific measurement channels when f(X_1, X_2, Y) is an invertible function.

Lossless Rate Region Example
Suppose the measurement channels in Figure 1 have binary input and output alphabets with multiplicative Bernoulli noise components, i.e., we have

X_1 = X · S_1, X_2 = X · S_2, Y = X · S_Y, Z = X · S_Z,

where S_1, S_2, X, and (S_Z, S_Y) are mutually independent, and we have P_X(1) = 0.5, P_S1(1) = β_1, P_S2(1) = β_2, P_SZSY(0, 0) = (1 − q), P_SZSY(1, 1) = qα, and P_SZSY(1, 0) = q(1 − α). For this model, the sum-storage rate constraint is active, since the sum of the individual bounds on R_w,1 and R_w,2 is smaller than the bound on (R_w,1 + R_w,2).
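The terms characterizing this example region can be computed numerically from the fixed joint distribution. The following sketch evaluates the secrecy-rate lower bound I(X_1, X_2; Z) − I(X_1, X_2; Y) of Lemma 2 with constant Q; the multiplicative model X_1 = X·S_1, X_2 = X·S_2, Y = X·S_Y, Z = X·S_Z and the choice P_SZSY(1, 0) = q(1 − α) are assumptions reconstructed for illustration, and the function names are not from the paper.

```python
from collections import defaultdict
from math import log2

def entropy(pmf):
    """Shannon entropy in bits of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def secrecy_bound(b1=0.2, b2=0.11, alpha=0.3, q=0.25):
    """Evaluate I(X1,X2; Z) - I(X1,X2; Y) (the Lemma 2 secrecy-rate lower
    bound with constant Q) for the multiplicative Bernoulli example.
    Assumed model: X ~ Bern(0.5), X1 = S1*X, X2 = S2*X, Y = SY*X, Z = SZ*X,
    with P(SZ,SY) = {(0,0): 1-q, (1,1): q*alpha, (1,0): q*(1-alpha)}, so
    X - Z - Y holds (the fusion center's channel is physically degraded)."""
    p_sz_sy = {(0, 0): 1 - q, (1, 1): q * alpha, (1, 0): q * (1 - alpha)}
    p12 = defaultdict(float)                         # P(X1, X2)
    py, pz = defaultdict(float), defaultdict(float)  # P(Y), P(Z)
    p12y, p12z = defaultdict(float), defaultdict(float)
    for x in (0, 1):
        for s1 in (0, 1):
            for s2 in (0, 1):
                for (sz, sy), pn in p_sz_sy.items():
                    p = 0.5 * (b1 if s1 else 1 - b1) * (b2 if s2 else 1 - b2) * pn
                    x1, x2, y, z = s1 * x, s2 * x, sy * x, sz * x
                    p12[(x1, x2)] += p
                    py[y] += p
                    pz[z] += p
                    p12y[(x1, x2, y)] += p
                    p12z[(x1, x2, z)] += p
    i_y = entropy(p12) + entropy(py) - entropy(p12y)  # I(X1,X2; Y)
    i_z = entropy(p12) + entropy(pz) - entropy(p12z)  # I(X1,X2; Z)
    return i_z - i_y
```

Since S_Y = 1 only when S_Z = 1, Y is a degraded version of Z here, so the data processing inequality guarantees the bound is nonnegative; setting α = 1 makes Y = Z and drives the bound to zero.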

Proof of Theorem 1

5.1. Inner Bound
Proof Sketch. The OSRB method [32] is used for the achievability proof by applying the steps given in [36, Section 1.6]. Random bin indices (F_v_i, W_v_i) and (F_u_i, W_u_i) are assigned to each v_i^n and u_i^n, respectively, for i = 1, 2. The indices F_1 = (F_v_1, F_u_1) and F_2 = (F_v_2, F_u_2) represent the public choice of the two encoders and one decoder, whereas W_1 = (W_v_1, W_u_1) and W_2 = (W_v_2, W_u_2) are the public messages sent by the encoders Enc_1(.) and Enc_2(.), respectively, to the fusion center.
We consider the following decoding order: in Steps 1 and 2, the decoder estimates V_1^n and V_2^n, and in Steps 3 and 4, the decoder estimates U_1^n and U_2^n. By swapping the indices 1 and 2 in the decoding order, another corner point in the achievable rate region is obtained, so we analyze the given decoding order but also provide the results for the other corner point.
Consider Step 1 in the decoding order given above. Using a SW [15] decoder, one can reliably estimate V_1^n from (Y^n, F_v_1, W_v_1) such that the expected value of the error probability, taken over the random bin assignments, vanishes when n → ∞ if the corresponding binning rate condition holds [32, Lemma 1]. Similarly, the estimations in Steps 2, 3, and 4 are reliable if the corresponding rate conditions hold, where (a) follows from the Markov chain, because then the expected value, taken over the random bin assignments, of the variational distance between the corresponding joint probability distributions vanishes. To satisfy (57)-(64), for any ε > 0 we fix the binning rates accordingly.

Public Message (Storage) Rates: (66) and (70) result in a public message (storage) rate R_w,1, and (68) and (72) result in a storage rate R_w,2, where (a) follows from the corresponding Markov chain. We remark that if the indices 1 and 2 in the decoding order given above are swapped, the other corner point is achieved.

Privacy Leakage to Decoder: We have the bound in which (a) follows for some ε_n > 0 with ε_n → 0 when n → ∞ by (57), (59), (60), and (61), together with inequalities that can be proved by using the corresponding Markov chains. Obtaining single-letter bounds on the term H(W_1, W_2, F_1, F_2 | Z^n) requires the analysis of numerous decodability cases, whereas only six different decodability cases are analyzed in [23] for secure function computation with a single transmitting node. To simplify our analysis by applying the results in [23], we combine Steps 1 and 2 of the decoding order given above such that (V_1, V_2) are treated jointly and, similarly, we combine Steps 3 and 4 such that (U_1, U_2) are treated jointly. Using the combined steps, we can consider the six decodability cases analyzed in [23, Section V-A] by replacing the corresponding random variables; a matching bound can be obtained by applying the same replacement to the second term in [23, Eq.
(54)]; we obtain from (79) and these decodability analyses the desired privacy leakage bound for some ε_n > 0 such that ε_n → 0 when n → ∞.

Secrecy Leakage (to Eve): We obtain the corresponding secrecy leakage bound similarly, which completes the proof of the inner bound.

5.2. Outer Bound

Admissibility of (U_1, U_2): Define ε_n such that ε_n → 0 if δ_n → 0. Using Fano's inequality and (2), we obtain the admissibility of (U_1, U_2), where (a) follows from [38, Lemma 2], which proves that when n → ∞ there exists an i.i.d. random variable that satisfies both the required recoverability condition and the required Markov chain, (b) follows from the data processing inequality because of the corresponding Markov chain and because randomized decoding is permitted, (c) follows from the corresponding Markov chain, and (d) follows from the definitions of U_1,i and U_2,i.

Public Message (Storage) Rates: We obtain the bound on R_w,1, where (a) follows by (4), (b) follows from the corresponding Markov chain, (c) follows from the data processing inequality applied to the corresponding Markov chain, (d) follows from the definition of U_1,i, and (e) follows by (92). Similarly, one can show by symmetry the bound on R_w,2. Now we consider the sum-rate bound, where (a) follows by (4) and (5), (b) follows since (X_1^n, X_2^n, Y^n) are i.i.d. and because of the corresponding Markov chain, (c) follows by applying the data processing inequality to the corresponding Markov chain, (d) follows from the definitions of U_1,i and U_2,i, (e) follows from the corresponding Markov chain, and (f) follows from another Markov chain.

Privacy Leakage to Decoder: We have the bound in which (a) follows by (6) and from the Markov chain (W_1, W_2) − X^n − Y^n, (b) follows from Csiszár's sum identity, (c) follows from the corresponding Markov chain, (e) follows from the definitions of U_1,i and U_2,i, and (f) follows from the corresponding Markov chain.

Privacy Leakage to Eve: We have the bound in which (a) follows by (7) and from the Markov chain (W_1, W_2) − X^n − Z^n, (b) follows from Csiszár's sum identity, (c) follows from the Markov chain in (111), (d) follows because (X^n, Y^n, Z^n) are i.i.d., and (e) follows from the definitions of V_1,i, V_2,i, U_1,i, and U_2,i.

Secrecy Leakage (to Eve): We obtain the bound in which (a) follows by (3), (b) follows because (X_1^n, X_2^n, Y^n) are i.i.d. and from Csiszár's sum identity and the Markov chain in (105), (c) follows because (Y^n, Z^n) are i.i.d.
and from the data processing inequality applied to the Markov chain in (106), (d) follows from the definitions of V_1,i, V_2,i, U_1,i, and U_2,i, (e) follows from the Markov chain given in (107), and (f) follows from the corresponding Markov chain. Introduce a uniformly distributed time-sharing random variable Q ~ Unif[1:n] that is independent of all other random variables, and define the single-letter random variables such that the required Markov chains form. The proof of the outer bound follows by letting δ_n → 0.

Cardinality Bounds: We use the support lemma [39, Lemma 15.4] to prove the cardinality bounds and apply similar steps as in [20,23], so we omit the proof.
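The Slepian-Wolf decoding steps in the achievability proof can be illustrated with a toy random-binning experiment: a source sequence is binned at a rate above the conditional entropy given the side information, and the decoder searches its bin for the member best matching the side information. The minimum-Hamming-distance search below stands in for joint-typicality decoding, and all parameters (n = 16, binning rate 0.85, BSC crossover 0.1) are illustrative assumptions, not values from the proof.

```python
import random

def sw_binning_demo(n=16, rate=0.85, p=0.1, trials=30, seed=1):
    """Toy Slepian-Wolf random-binning experiment: X^n is binned at a rate
    above H(X|Y) (about 0.47 bit for a BSC(0.1) correlation), and the decoder
    recovers X^n from its bin index and side information Y^n by picking the
    bin member closest to Y^n in Hamming distance.  Returns the empirical
    success rate over the trials."""
    rng = random.Random(seed)
    nbins = 2 ** int(rate * n)
    # Random bin assignment for every length-n binary sequence.
    bins = [rng.randrange(nbins) for _ in range(2 ** n)]
    members = {}
    for seq, b in enumerate(bins):
        members.setdefault(b, []).append(seq)
    dist = lambda a, b: bin(a ^ b).count("1")  # Hamming distance on ints
    ok = 0
    for _ in range(trials):
        x = rng.getrandbits(n)
        # Side information: x passed through a BSC with crossover p.
        y = x ^ sum((rng.random() < p) << i for i in range(n))
        guess = min(members[bins[x]], key=lambda s: dist(s, y))
        ok += (guess == x)
    return ok / trials
```

Because the binning rate exceeds H(X|Y), the true sequence is almost always the only bin member close to the side information, matching the vanishing-error behavior guaranteed by [32, Lemma 1] as n grows.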

Conclusion
We considered the function computation problem where three nodes observe correlated random variables and aim to compute a target function of their observations at the fusion center node. We modeled the source of the correlation between these nodes by positing that all three random variables are noisy observations of a remote random source. Furthermore, we imposed one secrecy, two privacy, and two storage constraints with operational meanings on this function computation problem to define a lossless rate region, considering an eavesdropper that observes a correlated random variable. The lossless function computation problem was extended by allowing the computed function to be a distorted version of the target function, which defined the lossy function computation problem.
We proposed inner and outer bounds for the lossless and lossy rate regions. The secrecy leakage and privacy leakage rates measured with respect to the eavesdropper were shown to be different due to the remote source considered, unlike in the literature. Furthermore, we established simplified rate region bounds for functions that are partially invertible with respect to one of the transmitting node observations, as well as for invertible functions. Moreover, we considered two different physical-degradation cases for the measurement channels of the eavesdropper and the fusion center when the computed function was invertible. We derived the corresponding rate region bounds, one of which was evaluated for an example scenario.
In future work, we will propose inner and outer bounds for the lossless and lossy multifunction computation problems with multiple transmitting nodes.

Figure 1 .
Figure 1. Single-function computation problem with two transmitting nodes under secrecy, privacy, and storage (or communication) constraints.
Then (48) is satisfied; see also [35, Section IV-A]. Using Lemma 4 for the given probability distributions, we evaluate an achievable lossless rate region R for an invertible function computation scenario with two transmitting nodes in which, e.g., β_1 = 0.2, β_2 = 0.11, α = 0.3, and q = 0.25.