Asymptotic Rate-Distortion Analysis of Symmetric Remote Gaussian Source Coding: Centralized Encoding vs. Distributed Encoding

Consider a symmetric multivariate Gaussian source with ℓ components, which are corrupted by independent and identically distributed Gaussian noises; these noisy components are compressed at a certain rate, and the compressed version is leveraged to reconstruct the source subject to a mean squared error distortion constraint. The rate-distortion analysis is performed for two scenarios: centralized encoding (where the noisy source components are jointly compressed) and distributed encoding (where the noisy source components are separately compressed). It is shown, among other things, that the gap between the rate-distortion functions associated with these two scenarios admits a simple characterization in the large ℓ limit.


Introduction
Many applications involve collection and transmission of potentially noise-corrupted data. It is often necessary to compress the collected data to reduce the transmission cost. The remote source coding problem aims to characterize the optimal scheme for such compression and the relevant information-theoretic limit. In this work we study a quadratic Gaussian version of the remote source coding problem, where compression is performed on the noise-corrupted components of a symmetric multivariate Gaussian source. A prescribed mean squared error distortion constraint is imposed on the reconstruction of the noise-free source components; moreover, it is assumed that the noises across different source components are independent and obey the same Gaussian distribution. Two scenarios are considered: centralized encoding (see Figure 1) and distributed encoding (see Figure 2). It is worth noting that the distributed encoding scenario is closely related to the CEO problem, which has been studied extensively [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16][17][18].
The present paper is primarily devoted to the comparison of the rate-distortion functions associated with the aforementioned two scenarios. We are particularly interested in understanding how the rate penalty for distributed encoding (relative to centralized encoding) depends on the target distortion as well as the parameters of the source and noise models. Although the information-theoretic results needed for this comparison are available in the literature or can be derived in a relatively straightforward manner, the relevant expressions are too unwieldy to analyze directly. For this reason, we focus on the asymptotic regime where the number of source components, denoted by ℓ, is sufficiently large. Indeed, it will be seen that the gap between the two rate-distortion functions admits a simple characterization in the large ℓ limit, yielding useful insights into the fundamental difference between centralized encoding and distributed encoding, which are hard to obtain otherwise.
The rest of this paper is organized as follows. We state the problem definitions and the main results in Section 2. The proofs are provided in Section 3. We conclude the paper in Section 4.
Notation: The expectation operator and the transpose operator are denoted by $E[\cdot]$ and $(\cdot)^T$, respectively. An ℓ-dimensional all-one row vector is written as $1_\ell$. We use $W^n$ as an abbreviation of $(W(1), \cdots, W(n))$. The cardinality of a set $\mathcal{C}$ is denoted by $|\mathcal{C}|$. We write $g(\ell) = O(f(\ell))$ if the absolute value of $g(\ell)/f(\ell)$ is bounded for all sufficiently large ℓ. Throughout this paper, the base of the logarithm function is e, and $\log^+ x \triangleq \max\{\log x, 0\}$.

Definition 1 (Centralized encoding).
A rate-distortion pair (r, d) is said to be achievable with centralized encoding if, for any ε > 0, there exists an encoding function $\phi^{(n)} : \mathbb{R}^{\ell \times n} \to \mathcal{C}^{(n)}$ such that . For a given d, we denote by $\underline{r}(d)$ the minimum r such that (r, d) is achievable with centralized encoding.

Definition 2 (Distributed encoding).
A rate-distortion pair (r, d) is said to be achievable with distributed encoding if, for any ε > 0, there exist encoding functions φ . For a given d, we denote by $\overline{r}(d)$ the minimum r such that (r, d) is achievable with distributed encoding.
We will refer to $\underline{r}(d)$ as the rate-distortion function of symmetric remote Gaussian source coding with centralized encoding, and to $\overline{r}(d)$ as the rate-distortion function of symmetric remote Gaussian source coding with distributed encoding. It is clear that $\underline{r}(d) \le \overline{r}(d)$ for any d, since distributed encoding can be simulated by centralized encoding. Moreover, it is easy to show that $\underline{r}(d) = \overline{r}(d) = 0$ for $d \ge \gamma_X$ (since the distortion constraint is trivially satisfied with the reconstruction set to zero) and $\underline{r}(d) = \overline{r}(d) = \infty$ for $d \le d_{\min}$ (since $d_{\min}$ is the minimum achievable distortion when $\{S(t)\}_{t=1}^{\infty}$ is directly available at the decoder); see Section 3.1 for a detailed derivation of $d_{\min}$. Henceforth we shall focus on the case $d \in (d_{\min}, \gamma_X)$.
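As a rough numerical illustration of $d_{\min}$, the sketch below assumes the usual symmetric model, which is not spelled out in the text above: the source covariance matrix has $\gamma_X$ on the diagonal and $\rho_X \gamma_X$ off the diagonal, and the noises are independent with a common variance (called `gamma_z` here, a name introduced only for this example). Under that assumption the covariance matrix has one eigenvalue $\gamma_X(1 + (\ell - 1)\rho_X)$ and $\ell - 1$ eigenvalues equal to $\gamma_X(1 - \rho_X)$, and the minimum distortion is the average of the per-eigendirection MMSE values. The function name `d_min` and its arguments are hypothetical.

```python
def d_min(ell, gamma_x, rho_x, gamma_z):
    """Per-component MMSE of X given S = X + Z under an ASSUMED symmetric model.

    Assumptions (not taken from the text): Cov(X) has gamma_x on the diagonal
    and rho_x * gamma_x off the diagonal; Z is i.i.d. Gaussian noise with
    variance gamma_z, independent of X.
    """
    # Eigenvalue along the all-one direction (multiplicity 1).
    lam_common = gamma_x * (1.0 + (ell - 1) * rho_x)
    # Eigenvalue of the remaining directions (multiplicity ell - 1).
    lam_rest = gamma_x * (1.0 - rho_x)

    def mmse(lam):
        # MMSE of a scalar Gaussian with variance lam observed in
        # independent Gaussian noise with variance gamma_z.
        return lam * gamma_z / (lam + gamma_z)

    # Average the eigendirection MMSE values over the ell components.
    return (mmse(lam_common) + (ell - 1) * mmse(lam_rest)) / ell
```

With gamma_x = 1, rho_x = 0.5 and gamma_z = 1, this function returns 0.5 for ℓ = 1, decreases as ℓ grows, and approaches 1/3, which is the MMSE of the repeated eigenvalue in this assumed model.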
The expressions of $\underline{r}(d)$ (centralized encoding) and $\overline{r}(d)$ (distributed encoding) as shown in Lemmas 1 and 2 are quite complicated, rendering it difficult to make analytical comparisons. Fortunately, they become significantly simplified in the asymptotic regime where ℓ → ∞ (with d fixed). To perform this asymptotic analysis, it is necessary to restrict attention to the case $\rho_X \in [0, 1]$; moreover, without loss of generality, we assume d ∈ (d

Theorem 1 (Centralized encoding).
The following result is a simple corollary of Theorems 1 and 2.

Proof of Lemma 1
It is known [21] that the centralized rate-distortion function $\underline{r}(d)$ is given by the solution to the following optimization problem: For the same reason, we have Denote the i-th components of $\tilde{X}$, $\tilde{Z}$, and $\tilde{S}$ by $\tilde{X}_i$, $\tilde{Z}_i$, and $\tilde{S}_i$, respectively, $i = 1, \cdots, \ell$. Clearly, $\tilde{S}_i = \tilde{X}_i + \tilde{Z}_i$, $i = 1, \cdots, \ell$. Moreover, it can be verified that $\tilde{X}_1, \cdots, \tilde{X}_\ell, \tilde{Z}_1, \cdots, \tilde{Z}_\ell$ are independent zero-mean Gaussian random variables with Now denote the i-th component of $\hat{S} \triangleq E[\tilde{X}|\tilde{S}]$ by $\hat{S}_i$, $i = 1, \cdots, \ell$. We have Note that which, together with (1)-(5), proves Clearly, $\hat{S}$ is determined by $\tilde{S}$; moreover, for any ℓ-dimensional random vector $\hat{X}$ jointly distributed with $(\tilde{X}, \tilde{S})$ such that $\tilde{X} \leftrightarrow \tilde{S} \leftrightarrow \hat{X}$ form a Markov chain, we have Therefore, (P2) is equivalent to One can readily complete the proof of Lemma 1 by recognizing that the solution to (P3) is given by the well-known reverse water-filling formula ([22] Theorem 13.3.3).
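The reverse water-filling step invoked at the end of this proof can be sketched numerically. The code below is a generic illustration for independent zero-mean Gaussian components under a total squared-error budget, with the rate measured in nats to match the natural logarithm used in this paper. It is not the closed-form expression of Lemma 1, and the names `reverse_water_filling`, `variances`, and `total_distortion` are hypothetical.

```python
import math


def reverse_water_filling(variances, total_distortion, tol=1e-12):
    """Return (rate_in_nats, water_level) for independent Gaussian components.

    The water level theta satisfies sum_i min(theta, variances[i]) ==
    total_distortion; components with variance above theta receive
    0.5 * log(variance / theta) nats, the others receive zero rate.
    """
    if total_distortion <= 0:
        raise ValueError("total_distortion must be positive")
    if total_distortion >= sum(variances):
        # The budget covers all the variance, so no rate is needed.
        return 0.0, max(variances)

    # Bisection on the water level: the allotted distortion is
    # nondecreasing in theta.
    lo, hi = 0.0, max(variances)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(min(mid, v) for v in variances) < total_distortion:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)

    rate = sum(0.5 * math.log(v / theta) for v in variances if v > theta)
    return rate, theta
```

For example, with two unit-variance components and a total distortion budget of 1, the water level is 0.5 and the rate is log 2 nats. Bisection is used here only for simplicity; a sort-based closed-form search for the water level would work equally well.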

Proof of Theorem 1
Setting $\rho_X = 0$ in Lemma 1 gives , $\gamma_X$); moreover, we have It remains to treat the case $\rho_X \in (0, 1)$. In this case, it can be deduced from Lemma 1 that Consider the following two subcases separately.
It can be seen from (6) that $d_{\min}$ is a monotonically decreasing function of ℓ and converges to when ℓ is sufficiently large. Note that and where (10) is due to (7). Substituting (9) and (11) into (8) gives In particular, we have when ℓ is sufficiently large. One can readily verify that Substituting (13) into (12) gives This completes the proof of Theorem 1.

Proof of Theorem 2
One can readily prove part one of Theorem 2 by setting $\rho_X = 0$ in Lemma 2. So only part two of Theorem 2 remains to be proved. Note that $b = g_1 \ell^2 + g_2 \ell$, We shall consider the following three cases separately.
• $d < \lambda_X$: In this case $g_1 > 0$ and consequently when ℓ is sufficiently large. Note that Substituting (15) into (14) gives It is easy to show that Combining (16), (17) and (18) yields Moreover, it can be verified via algebraic manipulations that Now we write the distributed rate-distortion function $\overline{r}(d)$ equivalently as Note that and Substituting (20) and (21) into (19) gives
• $d = \lambda_X$: In this case $g_1 = 0$ and consequently Note that Substituting (23) into (22) gives Moreover, it can be verified via algebraic manipulations that Now we proceed to derive an asymptotic expression of $\overline{r}(d)$. Note that and Substituting (24) and (25) into (19) gives
• $d > \lambda_X$: In this case $g_1 < 0$ and consequently when ℓ is sufficiently large. Note that Substituting (27) into (26) gives It is easy to show that Combining (28) and (29) yields Now we proceed to derive an asymptotic expression of $\overline{r}(d)$. Note that and Substituting (30) and (31) into (19) gives
This completes the proof of Theorem 2.

Conclusions
We have studied the problem of symmetric remote Gaussian source coding and made a systematic comparison of centralized encoding and distributed encoding in terms of the asymptotic rate-distortion performance. It is of great interest to extend our work by considering more general source and noise models.

Acknowledgments:
The authors wish to thank the anonymous reviewer for their valuable comments and suggestions.

Conflicts of Interest:
The authors declare no conflict of interest.