# A Forward-Reverse Brascamp-Lieb Inequality: Entropic Duality and Gaussian Optimality


## Abstract


## 1. Introduction

**Brascamp-Lieb Inequality and Its Reverse**

- For all nonnegative functions $g$ and $f_1,\dots,f_m$ such that:
  $$g(\mathbf{x})\le \prod_{j=1}^{m} f_j^{c_j}(\mathbf{B}_j\mathbf{x}),\quad \forall \mathbf{x},$$
  we have:
  $$\int_{E} g\le D\prod_{j=1}^{m}\left(\int_{E_j} f_j\right)^{c_j}.$$
- For all nonnegative measurable functions $g_1,\dots,g_l$ and $f$ such that:
  $$\prod_{i=1}^{l} g_i^{b_i}(\mathbf{z}_i)\le f\left(\sum_{i=1}^{l} b_i\mathbf{B}_i^{\ast}\mathbf{z}_i\right),\quad \forall \mathbf{z}_1,\dots,\mathbf{z}_l,$$
  we have:
  $$\prod_{i=1}^{l}\left(\int_{E_i} g_i\right)^{b_i}\le D\int_{E} f.$$
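As a sanity check on the forward inequality in the first item, consider its simplest special case (an illustration, not taken from the paper): all $\mathbf{B}_j$ equal to the identity, $m=2$, $c_1+c_2=1$ and $D=1$, where the statement reduces to Hölder's inequality. The Python sketch below checks this on a grid with counting measure, for which Hölder's inequality holds exactly. The grid, the two test functions, and the helper name `holder_check` are assumptions made for the example.

```python
import math

def holder_check(f1, f2, c1):
    """Return both sides of sum(f1^c1 * f2^c2) <= (sum f1)^c1 * (sum f2)^c2, c2 = 1 - c1."""
    c2 = 1.0 - c1
    # Largest admissible g when B_1 = B_2 = identity: g = f1^c1 * f2^c2 pointwise.
    g = [a ** c1 * b ** c2 for a, b in zip(f1, f2)]
    lhs = sum(g)                                # plays the role of the integral of g
    rhs = sum(f1) ** c1 * sum(f2) ** c2         # D = 1 in this special case
    return lhs, rhs

# Hypothetical test functions sampled on a grid (counting measure on the grid points).
xs = [-3.0 + 0.1 * k for k in range(61)]
f1 = [math.exp(-x * x) for x in xs]
f2 = [1.0 / (1.0 + (x - 1.0) ** 2) for x in xs]

lhs, rhs = holder_check(f1, f2, c1=0.3)
print(lhs <= rhs)

# Equality case: identical functions give lhs == rhs up to rounding.
lhs_eq, rhs_eq = holder_check(f1, f1, c1=0.3)
```

Any smaller $g$ satisfies the hypothesis as well, so checking the pointwise-largest $g$ is enough for this special case.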

**Theorem 1** (Dual formulation of the forward-reverse Brascamp-Lieb inequality).

- (i) $m$ and $l$ are positive integers, $d\in \mathbb{R}$, and $\mathcal{X}$ is a compact metric space;
- (ii) $b_i\in (0,\infty )$, $\nu_i$ is a finite Borel measure on a Polish space $\mathcal{Z}_i$, and $Q_{Z_i|X}$ is a random transformation from $\mathcal{X}$ to $\mathcal{Z}_i$, for each $i=1,\dots ,l$;
- (iii) $c_j\in (0,\infty )$, $\mu_j$ is a finite Borel measure on a Polish space $\mathcal{Y}_j$, and $Q_{Y_j|X}$ is a random transformation from $\mathcal{X}$ to $\mathcal{Y}_j$, for each $j=1,\dots ,m$;
- (iv) For any $(P_{Z_i})_{i=1}^{l}$ such that $\sum_{i=1}^{l}D(P_{Z_i}\parallel \nu_i)<\infty $, there exists $P_X$ such that $P_X\to Q_{Z_i|X}\to P_{Z_i}$, $i=1,\dots ,l$ and $\sum_{j=1}^{m}D(P_{Y_j}\parallel \mu_j)<\infty $, where $P_X\to Q_{Y_j|X}\to P_{Y_j}$, $j=1,\dots ,m$.

1. If the nonnegative continuous functions $(g_i)$, $(f_j)$ are bounded away from zero and satisfy:
   $$\sum_{i=1}^{l} b_i Q_{Z_i|X}(\log g_i)\le \sum_{j=1}^{m} c_j Q_{Y_j|X}(\log f_j), \tag{12}$$
   then:
   $$\prod_{i=1}^{l}\left(\int g_i\,\mathrm{d}\nu_i\right)^{b_i}\le \exp(d)\prod_{j=1}^{m}\left(\int f_j\,\mathrm{d}\mu_j\right)^{c_j}. \tag{13}$$
2. For any $(P_{Z_i})$ such that $D(P_{Z_i}\parallel \nu_i)<\infty $, $i=1,\dots ,l$ (this assumption is not essential if we adopt the convention that the infimum in (14) is $+\infty $ when it runs over an empty set):
   $$\sum_{i=1}^{l} b_i D(P_{Z_i}\parallel \nu_i)+d\ge \inf_{P_X}\sum_{j=1}^{m} c_j D(P_{Y_j}\parallel \mu_j), \tag{14}$$
   where the infimum is over $P_X$ such that $P_X\to Q_{Z_i|X}\to P_{Z_i}$, $i=1,\dots ,l$, and $P_X\to Q_{Y_j|X}\to P_{Y_j}$, $j=1,\dots ,m$.
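To make the two statements concrete, here is a toy instance chosen for illustration (it does not appear in the paper): a finite alphabet, $l=m=1$, $b_1=c_1=1$, $d=0$, $Q_{Z|X}$ the identity, $\nu$ a probability vector and $\mu=\nu Q_{Y|X}$. The entropic statement then reads $D(P\parallel \nu)\ge D(PQ\parallel \nu Q)$, the data processing inequality, and the functional statement reads: if $\log g\le Q_{Y|X}(\log f)$ pointwise, then $\nu(g)\le \mu(f)$, which follows from Jensen's inequality. The sketch below checks both numerically; the random channel, the seed, and all helper names are assumptions.

```python
import math
import random

def kl(p, q):
    """Relative entropy D(p||q) on a finite alphabet (p a pmf, q a positive measure)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def push(p, Q):
    """Output measure pQ of the input measure p through the channel Q (rows sum to 1)."""
    return [sum(p[x] * Q[x][y] for x in range(len(p))) for y in range(len(Q[0]))]

def cond_exp(Q, h):
    """Conditional expectation operator: (Qh)(x) = sum_y Q(y|x) h(y)."""
    return [sum(qxy * hy for qxy, hy in zip(row, h)) for row in Q]

def normalize(w):
    s = sum(w)
    return [v / s for v in w]

rng = random.Random(0)
nx, ny = 4, 3
Q = [normalize([rng.random() + 0.05 for _ in range(ny)]) for _ in range(nx)]
nu = normalize([rng.random() + 0.05 for _ in range(nx)])
mu = push(nu, Q)

# Entropic side with d = 0: D(P||nu) >= D(PQ||mu).
P = normalize([rng.random() + 0.05 for _ in range(nx)])
entropic_gap = kl(P, nu) - kl(push(P, Q), mu)

# Functional side with d = 0: take the pointwise-largest g allowed by log g <= Q(log f).
f = [rng.random() + 0.1 for _ in range(ny)]
g = [math.exp(v) for v in cond_exp(Q, [math.log(fy) for fy in f])]
functional_gap = sum(m * fy for m, fy in zip(mu, f)) - sum(n * gx for n, gx in zip(nu, g))

print(entropic_gap >= 0, functional_gap >= 0)
```

With the identity channel on the $Z$ side, the only admissible $P_X$ is $P_Z$ itself, so the infimum in the entropic statement is trivial in this instance.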

**Theorem 2.**

**Example 1.**

## 2. Review of the Legendre-Fenchel Duality Theory

**Notation 1.**

- ${C}_{c}(\mathcal{X})$ denotes the space of continuous functions on $\mathcal{X}$ with a compact support;
- ${C}_{0}(\mathcal{X})$ denotes the space of all continuous functions $f$ on $\mathcal{X}$ that vanish at infinity (i.e., for any $\epsilon >0$, there exists a compact set $\mathcal{K}\subseteq \mathcal{X}$ such that $|f(x)|<\epsilon $ for $x\in \mathcal{X}\backslash \mathcal{K}$);
- ${C}_{b}(\mathcal{X})$ denotes the space of bounded continuous functions on $\mathcal{X}$;
- $\mathcal{M}(\mathcal{X})$ denotes the space of finite signed Borel measures on $\mathcal{X}$;
- $\mathcal{P}(\mathcal{X})$ denotes the space of probability measures on $\mathcal{X}$.

**Theorem 3** (Riesz-Markov-Kakutani).

**Remark 1.**

**Definition 1.**

**Remark 2.**

**Definition 2.**

- ${C}_{b}(\varphi ):{C}_{b}({\mathcal{Z}}_{1})\to {C}_{b}({\mathcal{Z}}_{1}\times {\mathcal{Z}}_{2})$ is called a canonical map, whose action is almost trivial: it sends a function of ${z}_{1}$ to itself, but viewed as a function of $({z}_{1},{z}_{2})$.
- $\mathcal{M}(\varphi ):\mathcal{M}({\mathcal{Z}}_{1}\times {\mathcal{Z}}_{2})\to \mathcal{M}({\mathcal{Z}}_{1})$ is called marginalization, which simply takes a joint distribution to a marginal distribution.
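On finite sets the two maps above are easy to write down, and one can check that they are adjoint to each other: integrating a function of $z_1$ against the marginal gives the same number as integrating its lifted version against the joint measure. The sketch below is a minimal illustration under that finite-alphabet assumption; the function names `lift`, `marginalize`, `pair_joint` and `pair_marginal` are hypothetical.

```python
def lift(f, n2):
    """Canonical map: view f(z1) as the function (z1, z2) -> f(z1)."""
    return [[f[z1] for _ in range(n2)] for z1 in range(len(f))]

def marginalize(pi):
    """Marginalization: joint measure on Z1 x Z2 -> its marginal on Z1."""
    return [sum(row) for row in pi]

def pair_joint(pi, u):
    """Integral of u against the joint measure pi."""
    return sum(pi[a][b] * u[a][b] for a in range(len(pi)) for b in range(len(pi[0])))

def pair_marginal(p, f):
    """Integral of f against the measure p on Z1."""
    return sum(pa * fa for pa, fa in zip(p, f))

# A finite signed measure on a 2 x 3 grid and a function on Z1 (arbitrary values).
pi = [[0.10, 0.25, 0.05],
      [0.30, -0.02, 0.32]]
f = [1.5, -0.7]

lhs = pair_marginal(marginalize(pi), f)   # marginal paired with f
rhs = pair_joint(pi, lift(f, n2=3))       # joint paired with the lifted f
print(abs(lhs - rhs) < 1e-12)
```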

**Theorem 4.**

**Proof.**

**Theorem 5** (Hahn-Banach).

1. If the interior of $C$ is non-empty, then there exists $\ell \in {A}^{\ast}$, $\ell \ne 0$ such that:$$\sup_{u\in B}\ell (u)\le \inf_{u\in C}\ell (u).$$
2. If $A$ is locally convex, $B$ is compact and $C$ is closed, then there exists $\ell \in {A}^{\ast}$ such that:$$\sup_{u\in B}\ell (u)<\inf_{u\in C}\ell (u).$$

**Remark 3.**

## 3. The Entropic-Functional Duality

#### 3.1. Compact $\mathcal{X}$

**Proof of Theorem 1.**

**1)⇒2)** This is the nontrivial direction, which relies on certain (strong) min-max type results. In Theorem 4, put (in (36), $u\le 0$ means that $u$ is pointwise non-positive):
$${\mathsf{\Theta}}_{0}:u\in {C}_{b}(\mathcal{X})\mapsto \begin{cases}0 & u\le 0;\\ +\infty & \text{otherwise}.\end{cases} \tag{36}$$
Then:
$${\mathsf{\Theta}}_{0}^{\ast}:\pi \in \mathcal{M}(\mathcal{X})\mapsto \begin{cases}0 & \pi \ge 0;\\ +\infty & \text{otherwise}.\end{cases} \tag{37}$$
For each $j=1,\dots ,m$, put:
$${\mathsf{\Theta}}_{j}(u):={c}_{j}\inf_{v\in C_b(\mathcal{Y}_j):\,T_j v=u}\log {\mu}_{j}\left(\exp \left(\frac{1}{{c}_{j}}v\right)\right), \tag{38}$$
with the convention that the infimum over an empty set is $+\infty $. Note that:

- ${\mathsf{\Theta}}_{j}$ is convex: indeed, given arbitrary ${u}^{0}$ and ${u}^{1}$, suppose that ${v}^{0}$ and ${v}^{1}$ respectively achieve the infimum in (38) for ${u}^{0}$ and ${u}^{1}$ (if the infimum is not achieved, the argument still goes through by an approximation and limit argument). Then, for any $\alpha \in [0,1]$, ${v}^{\alpha}:=(1-\alpha ){v}^{0}+\alpha {v}^{1}$ satisfies ${u}^{\alpha}={T}_{j}{v}^{\alpha}$ where ${u}^{\alpha}:=(1-\alpha ){u}^{0}+\alpha {u}^{1}$. Thus, the convexity of ${\mathsf{\Theta}}_{j}$ follows from the convexity of the functional in (23);
- ${\mathsf{\Theta}}_{j}(u)>-\infty $ for any $u\in {C}_{b}(\mathcal{X})$. Otherwise, for any ${P}_{X}$ and ${P}_{{Y}_{j}}:={T}_{j}^{\ast}{P}_{X}$, we would have:
  $$D({P}_{{Y}_{j}}\parallel {\mu}_{j})=\sup_{v}\{{P}_{{Y}_{j}}(v)-\log {\mu}_{j}(\exp (v))\} \tag{39}$$
  $$=\sup_{v}\{{P}_{X}({T}_{j}v)-\log {\mu}_{j}(\exp (v))\} \tag{40}$$
  $$=\sup_{u\in {C}_{b}(\mathcal{X})}\left\{{P}_{X}(u)-\frac{1}{{c}_{j}}{\mathsf{\Theta}}_{j}({c}_{j}u)\right\} \tag{41}$$
  $$=+\infty , \tag{42}$$
  which is incompatible with assumption (iv);
- From Steps (39)–(41), we see ${\mathsf{\Theta}}_{j}^{\ast}(\pi )={c}_{j}D({T}_{j}^{\ast}\pi \parallel {\mu}_{j})$ for any $\pi \in \mathcal{M}(\mathcal{X})$, where the definition of $D(\cdot \parallel {\mu}_{j})$ is extended using the Donsker-Varadhan formula (that is, it is infinite when the argument is not a probability measure).
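The Donsker-Varadhan formula used in these steps can be checked directly on a finite alphabet. The sketch below is an illustration under that assumption (the vectors `p`, `mu`, `pi_bad` and the helper names are hypothetical): the objective $P(v)-\log \mu (\exp (v))$ never exceeds $D(P\parallel \mu )$, it attains that value at $v=\log (\mathrm{d}P/\mathrm{d}\mu )$, and it is unbounded along constant functions when the first argument has total mass different from one, which is the extension mentioned in the last item.

```python
import math
import random

def kl(p, mu):
    """D(p||mu) for a pmf p and a finite positive measure mu on a finite alphabet."""
    return sum(pi * math.log(pi / mi) for pi, mi in zip(p, mu) if pi > 0)

def dv_objective(p, mu, v):
    """Donsker-Varadhan objective p(v) - log mu(exp(v))."""
    pv = sum(pi * vi for pi, vi in zip(p, v))
    return pv - math.log(sum(mi * math.exp(vi) for mi, vi in zip(mu, v)))

p = [0.5, 0.3, 0.2]
mu = [2.0, 0.5, 1.5]                      # finite measure with total mass 4
d = kl(p, mu)

# The supremum is attained at v = log(dP/dmu).
v_star = [math.log(pi / mi) for pi, mi in zip(p, mu)]
at_optimum = dv_objective(p, mu, v_star)

# Random test functions never beat the relative entropy.
rng = random.Random(1)
trials = [dv_objective(p, mu, [rng.uniform(-3, 3) for _ in p]) for _ in range(1000)]

# If the first argument has mass 1.1 instead of 1, constants v = c drive the objective up.
pi_bad = [0.5, 0.3, 0.3]
grows = [dv_objective(pi_bad, mu, [c] * 3) for c in (10.0, 100.0, 500.0)]

print(abs(at_optimum - d) < 1e-12, max(trials) <= d + 1e-12)
```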

Finally, for the given ${({P}_{{Z}_{i}})}_{i=1}^{l}$, choose:
$${\mathsf{\Theta}}_{m+1}:u\in {C}_{b}(\mathcal{X})\mapsto \begin{cases}\sum_{i=1}^{l}{P}_{{Z}_{i}}({w}_{i}) & \text{if } u=\sum_{i=1}^{l}{S}_{i}{w}_{i} \text{ for some } {w}_{i}\in {C}_{b}({\mathcal{Z}}_{i});\\ +\infty & \text{otherwise}.\end{cases} \tag{43}$$

- ${\mathsf{\Theta}}_{m+1}$ is convex;
- ${\mathsf{\Theta}}_{m+1}$ is well defined (that is, the choice of $({w}_{i})$ in (43) is inconsequential). Indeed, if ${({w}_{i})}_{i=1}^{l}$ is such that ${\sum}_{i=1}^{l}{S}_{i}{w}_{i}=0$, then:
  $$\begin{aligned}\sum_{i=1}^{l}{P}_{{Z}_{i}}({w}_{i}) &=\sum_{i=1}^{l}{S}_{i}^{\ast}{P}_{X}({w}_{i})\\ &=\sum_{i=1}^{l}{P}_{X}({S}_{i}{w}_{i})\\ &=0,\end{aligned} \tag{44}$$
  where ${P}_{X}$ is such that ${S}_{i}^{\ast}{P}_{X}={P}_{{Z}_{i}}$, $i=1,\dots ,l$, whose existence is guaranteed by assumption (iv);
- $$\begin{aligned}{\mathsf{\Theta}}_{m+1}^{\ast}(\pi ) &:=\sup_{u}\{\pi (u)-{\mathsf{\Theta}}_{m+1}(u)\}\\ &=\sup_{{w}_{1},\dots ,{w}_{l}}\left\{\pi \left(\sum_{i=1}^{l}{S}_{i}{w}_{i}\right)-\sum_{i=1}^{l}{P}_{{Z}_{i}}({w}_{i})\right\}\\ &=\sup_{{w}_{1},\dots ,{w}_{l}}\left\{\sum_{i=1}^{l}{S}_{i}^{\ast}\pi ({w}_{i})-\sum_{i=1}^{l}{P}_{{Z}_{i}}({w}_{i})\right\}\\ &=\begin{cases}0 & \text{if } {S}_{i}^{\ast}\pi ={P}_{{Z}_{i}},\quad i=1,\dots ,l;\\ +\infty & \text{otherwise}.\end{cases}\end{aligned} \tag{45}$$

Invoking Theorem 4 (where the ${u}_{j}$ in Theorem 4 can be chosen as the constant function ${u}_{j}\equiv 1$, $j=1,\dots ,m+1$):
$$\inf_{\pi :\,\pi \ge 0,\,{S}_{i}^{\ast}\pi ={P}_{{Z}_{i}}}\sum_{j=1}^{m}{c}_{j}D({T}_{j}^{\ast}\pi \parallel {\mu}_{j})=-\inf_{{v}^{m},{w}^{l}:\,\sum_{j=1}^{m}{T}_{j}{v}_{j}+\sum_{i=1}^{l}{S}_{i}{w}_{i}\ge 0}\left[\sum_{j=1}^{m}{c}_{j}\log {\mu}_{j}\left(\exp \left(\frac{1}{{c}_{j}}{v}_{j}\right)\right)+\sum_{i=1}^{l}{P}_{{Z}_{i}}({w}_{i})\right] \tag{46}$$
By the definition of the infimum on the right side of (46), for any $\epsilon >0$ there exist $({v}_{j})$ and $({w}_{i})$ satisfying the constraint in (46) such that:
$$\epsilon -\sum_{j=1}^{m}{c}_{j}\log {\mu}_{j}\left(\exp \left(\frac{1}{{c}_{j}}{v}_{j}\right)\right)-\sum_{i=1}^{l}{P}_{{Z}_{i}}({w}_{i})>\inf_{\pi :\,\pi \ge 0,\,{S}_{i}^{\ast}\pi ={P}_{{Z}_{i}}}\sum_{j=1}^{m}{c}_{j}D({T}_{j}^{\ast}\pi \parallel {\mu}_{j}) \tag{47}$$
Moreover, with ${g}_{i}:=\exp (-{w}_{i}/{b}_{i})$, the Donsker-Varadhan formula gives:
$$\epsilon -\sum_{i=1}^{l}{b}_{i}\log {\nu}_{i}({g}_{i})+\sum_{i=1}^{l}{b}_{i}{P}_{{Z}_{i}}(\log {g}_{i})\le \epsilon +\sum_{i=1}^{l}{b}_{i}D({P}_{{Z}_{i}}\parallel {\nu}_{i}) \tag{48}$$
Combining (47) and (48) with 1), applied to $({g}_{i})$ and ${f}_{j}:=\exp ({v}_{j}/{c}_{j})$, and letting $\epsilon \downarrow 0$ yields 2).

**2)⇒1)** Since ${\nu}_{i}$ is finite and ${g}_{i}$ is bounded by assumption, we have ${\nu}_{i}({g}_{i})<\infty $, $i=1,\dots ,l$. Moreover, (13) is trivially true when ${\nu}_{i}({g}_{i})=0$ for some $i$, so we will assume below that ${\nu}_{i}({g}_{i})\in (0,\infty )$ for each $i$.
Define ${P}_{{Z}_{i}}$ by:
$$\frac{\mathrm{d}{P}_{{Z}_{i}}}{\mathrm{d}{\nu}_{i}}=\frac{{g}_{i}}{{\nu}_{i}({g}_{i})},\quad i=1,\dots ,l. \tag{49}$$
Then, for any $\epsilon >0$:
$$\sum_{i=1}^{l}{b}_{i}\log {\nu}_{i}({g}_{i})=\sum_{i=1}^{l}{b}_{i}[{P}_{{Z}_{i}}(\log {g}_{i})-D({P}_{{Z}_{i}}\parallel {\nu}_{i})] \tag{50}$$
$$<\sum_{j=1}^{m}{c}_{j}{P}_{{Y}_{j}}(\log {f}_{j})+\epsilon -\sum_{j=1}^{m}{c}_{j}D({P}_{{Y}_{j}}\parallel {\mu}_{j}) \tag{51}$$
$$\le \epsilon +\sum_{j=1}^{m}{c}_{j}\log {\mu}_{j}({f}_{j}) \tag{52}$$
where:

- (51) uses the Donsker-Varadhan formula, and we have chosen ${P}_{X}$, ${P}_{{Y}_{j}}:={T}_{j}^{\ast}{P}_{X}$, $j=1,\dots ,m$ such that:$$\sum_{i=1}^{l}{b}_{i}D({P}_{{Z}_{i}}\parallel {\nu}_{i})>\sum_{j=1}^{m}{c}_{j}D({P}_{{Y}_{j}}\parallel {\mu}_{j})-\epsilon \tag{53}$$
- (52) also follows from the Donsker-Varadhan formula.

The result follows since $\epsilon >0$ can be arbitrary. ☐
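The change of measure at the start of the 2)⇒1) direction, $\mathrm{d}P_{Z_i}/\mathrm{d}\nu_i=g_i/\nu_i(g_i)$, makes the Donsker-Varadhan bound tight, which is the identity $\log \nu (g)=P(\log g)-D(P\parallel \nu )$ behind the first step of the chain. A minimal finite-alphabet check, with arbitrary illustrative values for `nu` and `g` (both assumptions):

```python
import math

nu = [0.7, 1.1, 0.4, 2.0]        # a finite measure (not normalized)
g = [0.5, 2.0, 1.0, 0.25]        # a positive function

nu_g = sum(n * gi for n, gi in zip(nu, g))
P = [n * gi / nu_g for n, gi in zip(nu, g)]           # tilted measure, dP/dnu = g / nu(g)

P_log_g = sum(p * math.log(gi) for p, gi in zip(P, g))
D = sum(p * math.log(p / n) for p, n in zip(P, nu))   # D(P||nu)

lhs, rhs = math.log(nu_g), P_log_g - D
print(abs(lhs - rhs) < 1e-12)
```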

**Remark 4.**

**Theorem 6.**

- For any ${P}_{X}$ such that $D({S}_{i}^{\ast}{P}_{X}\parallel {\nu}_{i})<\infty $, $i=1,\dots ,l$, there exists ${\tilde{P}}_{X}$ such that ${S}_{i}^{\ast}{\tilde{P}}_{X}={S}_{i}^{\ast}{P}_{X}$ for each $i$ and ${\sum}_{j=1}^{m}{c}_{j}D({T}_{j}^{\ast}{\tilde{P}}_{X}\parallel {\mu}_{j})<\infty $.

1. For any nonnegative continuous functions $({g}_{i})$, $({f}_{j})$ bounded away from zero and such that:
   $$\sum_{i=1}^{l}{b}_{i}{S}_{i}\log {g}_{i}\le \sum_{j=1}^{m}{c}_{j}{T}_{j}\log {f}_{j},$$
   we have:
   $$\inf_{({\tilde{g}}_{i}):\,\sum_{i=1}^{l}{b}_{i}{S}_{i}\log {\tilde{g}}_{i}\ge \sum_{i=1}^{l}{b}_{i}{S}_{i}\log {g}_{i}}\prod_{i=1}^{l}{\nu}_{i}^{{b}_{i}}({\tilde{g}}_{i})\le \exp (d)\prod_{j=1}^{m}{\mu}_{j}^{{c}_{j}}({f}_{j}).$$
2. For any ${P}_{X}$ such that $D({S}_{i}^{\ast}{P}_{X}\parallel {\nu}_{i})<\infty $, $i=1,\dots ,l$,
   $$\sum_{i=1}^{l}{b}_{i}D({S}_{i}^{\ast}{P}_{X}\parallel {\nu}_{i})+d\ge \inf_{{\tilde{P}}_{X}:\,{S}_{i}^{\ast}{\tilde{P}}_{X}={S}_{i}^{\ast}{P}_{X}}\sum_{j=1}^{m}{c}_{j}D({T}_{j}^{\ast}{\tilde{P}}_{X}\parallel {\mu}_{j}).$$

**Proof.**

- To see (67), we note that the sup in (66) can be restricted to probability measures $\pi $, since otherwise the relative entropy terms in (66) are $+\infty $ by their definition via the Donsker-Varadhan formula. Then, we select ${P}_{X}$ such that (67) holds.
- In (68), we have chosen ${\tilde{P}}_{X}$ such that:
  $${S}_{i}^{\ast}{\tilde{P}}_{X}={S}_{i}^{\ast}{P}_{X},\quad 1\le i\le l;$$
  $$\sum_{i=1}^{l}{b}_{i}D({S}_{i}^{\ast}{P}_{X}\parallel {\nu}_{i})>\sum_{j=1}^{m}{c}_{j}D({T}_{j}^{\ast}{\tilde{P}}_{X}\parallel {\mu}_{j})-\epsilon .$$

☐

**Remark 5.**

**Remark 6.**

#### 3.2. Noncompact $\mathcal{X}$

**Theorem 7.**

- The assumption that $\mathcal{X}$ is a compact metric space is relaxed to the assumption that it is a locally compact and σ-compact Polish space;
- $\mathcal{X}={\prod}_{i=1}^{l}{\mathcal{Z}}_{i}$ and ${S}_{i}:{C}_{b}({\mathcal{Z}}_{i})\to {C}_{b}(\mathcal{X})$, $i=1,\dots ,l$ are canonical maps (see Definition 2).

**Proof.**

**Remark 7.**

## 4. Gaussian Optimality

#### 4.1. Non-Degenerate Forward Channels

**Assumption 1.**

- Fix Lebesgue measures ${({\mu}_{j})}_{j=1}^{m}$ and Gaussian measures ${({\nu}_{i})}_{i=1}^{l}$ on $\mathbb{R}$;
- non-degenerate (Definition 3 below) linear Gaussian random transformations ${({P}_{{Y}_{j}|\mathbf{X}})}_{j=1}^{m}$ (where $\mathbf{X}:=({X}_{1},\dots ,{X}_{l})$) associated with conditional expectation operators ${({T}_{j})}_{j=1}^{m}$;
- ${({S}_{i})}_{i=1}^{l}$ are induced by coordinate projections;
- positive $({c}_{j})$ and $({b}_{i})$.
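With these reference measures, the relative entropies of a centered Gaussian input have closed forms: against the Lebesgue measure, $D(P\parallel \mu )$ is the negative differential entropy, and against a Gaussian probability measure it depends only on the variance ratio. The sketch below is an illustration with assumed variances `s2` and `sigma2` and hypothetical helper names; it compares the closed forms with a trapezoid-rule integration, treating each $\nu_i$ as a normalized Gaussian.

```python
import math

def d_gauss_vs_lebesgue(s2):
    """D(N(0, s2) || Lebesgue) = -h(N(0, s2)) = -0.5 * log(2*pi*e*s2)."""
    return -0.5 * math.log(2 * math.pi * math.e * s2)

def d_gauss_vs_gauss(s2, sigma2):
    """D(N(0, s2) || N(0, sigma2)) = 0.5 * (r - 1 - log r), r = s2 / sigma2."""
    r = s2 / sigma2
    return 0.5 * (r - 1.0 - math.log(r))

def log_gauss(x, var):
    """Log-density of N(0, var) at x."""
    return -0.5 * x * x / var - 0.5 * math.log(2 * math.pi * var)

def numeric(s2, log_ref_density, n=40001, width=12.0):
    """Trapezoid rule for the integral of p * (log p - log ref), with p = N(0, s2)."""
    s = math.sqrt(s2)
    a, b = -width * s, width * s
    h = (b - a) / (n - 1)
    total = 0.0
    for k in range(n):
        x = a + k * h
        logp = log_gauss(x, s2)
        w = 0.5 if k in (0, n - 1) else 1.0   # endpoint weights of the trapezoid rule
        total += w * math.exp(logp) * (logp - log_ref_density(x))
    return total * h

s2, sigma2 = 2.0, 0.5                          # assumed input and reference variances
num_leb = numeric(s2, lambda x: 0.0)           # Lebesgue reference: log-density 0
num_gau = numeric(s2, lambda x: log_gauss(x, sigma2))
print(abs(num_leb - d_gauss_vs_lebesgue(s2)) < 1e-6,
      abs(num_gau - d_gauss_vs_gauss(s2, sigma2)) < 1e-6)
```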

**Definition 3.**

**Theorem 8.**

**Proposition 1.**

**Proof.**

**Proposition 2.**

1. For any ${({P}_{{X}_{i}})}_{i=1}^{l}$, the infimum in (75) is attained by some Borel ${P}_{\mathbf{X}}$.
2. If ${({P}_{{Y}_{j}|{X}^{l}})}_{j=1}^{m}$ are non-degenerate (Definition 3), then the supremum in (76) is achieved by some Borel ${({P}_{{X}_{i}})}_{i=1}^{l}$.

**Lemma 1.**

**Proof.**

**Proof of Theorem 8.**

- Assume that $({P}_{{X}_{i}^{(1)}})$ and $({P}_{{X}_{i}^{(2)}})$ are maximizers of ${F}_{0}$ (possibly equal). Let ${P}_{{X}_{i}^{(1,2)}}:={P}_{{X}_{i}^{(1)}}\times {P}_{{X}_{i}^{(2)}}$. Define:
  $${\mathbf{X}}^{+}:=\frac{1}{\sqrt{2}}\left({\mathbf{X}}^{(1)}+{\mathbf{X}}^{(2)}\right);$$
  $${\mathbf{X}}^{-}:=\frac{1}{\sqrt{2}}\left({\mathbf{X}}^{(1)}-{\mathbf{X}}^{(2)}\right).$$
- Next, we perform the same algebraic expansion as in the proof of tensorization:
  $$\sum_{t=1}^{2}{F}_{0}\left({\left({P}_{{X}_{i}^{(t)}}\right)}_{i=1}^{l}\right)=\inf_{{P}_{{\mathbf{X}}^{(1,2)}}:\,{S}_{i}^{\ast \otimes 2}{P}_{{\mathbf{X}}^{(1,2)}}={P}_{{X}_{i}^{(1,2)}}}\sum_{j}{c}_{j}D({P}_{{Y}_{j}^{(1,2)}}\parallel {\mu}_{j}^{\otimes 2})-\sum_{i}{b}_{i}D({P}_{{X}_{i}^{(1,2)}}\parallel {\nu}_{i}^{\otimes 2}) \tag{84}$$
  $$=\inf_{{P}_{{\mathbf{X}}^{+}{\mathbf{X}}^{-}}:\,{S}_{i}^{\ast \otimes 2}{P}_{{\mathbf{X}}^{+}{\mathbf{X}}^{-}}={P}_{{X}_{i}^{+}{X}_{i}^{-}}}\sum_{j}{c}_{j}D({P}_{{Y}_{j}^{+}{Y}_{j}^{-}}\parallel {\mu}_{j}^{\otimes 2})-\sum_{i}{b}_{i}D({P}_{{X}_{i}^{+}{X}_{i}^{-}}\parallel {\nu}_{i}^{\otimes 2}) \tag{85}$$
  $$\begin{aligned}\le {}&\inf_{{P}_{{\mathbf{X}}^{+}{\mathbf{X}}^{-}}:\,{S}_{i}^{\ast \otimes 2}{P}_{{\mathbf{X}}^{+}{\mathbf{X}}^{-}}={P}_{{X}_{i}^{+}{X}_{i}^{-}}}\sum_{j}{c}_{j}\left[D({P}_{{Y}_{j}^{+}}\parallel {\mu}_{j})+D({P}_{{Y}_{j}^{-}|{\mathbf{X}}^{+}}\parallel {\mu}_{j}|{P}_{{\mathbf{X}}^{+}})\right]\\ &-\sum_{i}{b}_{i}\left[D({P}_{{X}_{i}^{+}}\parallel {\nu}_{i})+D({P}_{{X}_{i}^{-}|{X}_{i}^{+}}\parallel {\nu}_{i}|{P}_{{X}_{i}^{+}})\right]\end{aligned} \tag{86}$$
  $$\begin{aligned}\le {}&\sum_{j}{c}_{j}\left[D({P}_{{Y}_{j}^{+}}^{\star}\parallel {\mu}_{j})+D({P}_{{Y}_{j}^{-}|{\mathbf{X}}^{+}}^{\star}\parallel {\mu}_{j}|{P}_{{\mathbf{X}}^{+}}^{\star})\right]\\ &-\sum_{i}{b}_{i}\left[D({P}_{{X}_{i}^{+}}^{\star}\parallel {\nu}_{i})+D({P}_{{X}_{i}^{-}|{\mathbf{X}}^{+}}^{\star}\parallel {\nu}_{i}|{P}_{{\mathbf{X}}^{+}}^{\star})\right]\end{aligned} \tag{87}$$
  $$={F}_{0}\left({\left({P}_{{X}_{i}^{+}}^{\star}\right)}_{i=1}^{l}\right)+\int {F}_{0}\left({\left({P}_{{X}_{i}^{-}|{\mathbf{X}}^{+}}^{\star}\right)}_{i=1}^{l}\right)\mathrm{d}{P}_{{\mathbf{X}}^{+}}^{\star} \tag{88}$$
  $$\le \sum_{t=1}^{2}{F}_{0}\left({\left({P}_{{X}_{i}^{(t)}}\right)}_{i=1}^{l}\right) \tag{89}$$
- (84) uses Lemma 1.
- (86) is because of the Markov chain ${Y}_{j}^{+}-{\mathbf{X}}^{+}-{Y}_{j}^{-}$ (for any coupling).
- In (87), we selected a particular instance of coupling ${P}_{{\mathbf{X}}^{+}{\mathbf{X}}^{-}}$, constructed as follows: first, we select an optimal coupling ${P}_{{\mathbf{X}}^{+}}$ for given marginals $({P}_{{X}_{i}^{+}})$. Then, for any ${\mathbf{x}}^{+}={({x}_{i}^{+})}_{i=1}^{l}$, let ${P}_{{\mathbf{X}}^{-}|{\mathbf{X}}^{+}={\mathbf{x}}^{+}}$ be an optimal coupling of $({P}_{{X}_{i}^{-}|{X}_{i}^{+}={x}_{i}^{+}})$ (for a justification that we can select an optimal coupling ${P}_{{\mathbf{X}}^{-}|{\mathbf{X}}^{+}={\mathbf{x}}^{+}}$ in a way that ${P}_{{\mathbf{X}}^{-}|{\mathbf{X}}^{+}}$ is indeed a regular conditional probability distribution, see [7]). With this construction, it is apparent that ${X}_{i}^{+}-{\mathbf{X}}^{+}-{X}_{i}^{-}$ is a Markov chain, and hence:$$D({P}_{{X}_{i}^{-}|{X}_{i}^{+}}\parallel {\nu}_{i}|{P}_{{X}_{i}^{+}})=D({P}_{{X}_{i}^{-}|{\mathbf{X}}^{+}}\parallel {\nu}_{i}|{P}_{{\mathbf{X}}^{+}}).$$
- (88) is because in the above, we have constructed the coupling optimally.
- (89) is because $({P}_{{X}_{i}^{(t)}})$ maximizes ${F}_{0}$, $t=1,2$.

- Thus, in the expansions above, equalities are attained throughout. Using the differentiation technique as in the case of the forward inequality, for almost all $({b}_{i})$, $({c}_{j})$, we have:
  $$D({P}_{{X}_{i}^{-}|{X}_{i}^{+}}\parallel {\nu}_{i}|{P}_{{X}_{i}^{+}})=D({P}_{{X}_{i}^{+}}\parallel {\nu}_{i})$$
  $$=D({P}_{{X}_{i}^{-}}\parallel {\nu}_{i}),\quad \forall i$$

☐
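The rotation $({\mathbf{X}}^{(1)},{\mathbf{X}}^{(2)})\mapsto ({\mathbf{X}}^{+},{\mathbf{X}}^{-})$ used in this proof has a simple Gaussian sanity check, shown here for one scalar coordinate: if $X^{(1)}$ and $X^{(2)}$ are independent, centered, and have the same variance, then $X^{+}$ and $X^{-}$ are uncorrelated with that same variance, so a pair of identical Gaussians is left unchanged in law. The sketch below only computes second moments; it is not the argument that maximizers must be Gaussian, and the function name `rotate_cov` is an assumption.

```python
import math

def rotate_cov(s1, s2):
    """Covariance of (X+, X-) for independent zero-mean X1, X2 with variances s1, s2."""
    r = 1.0 / math.sqrt(2.0)
    R = [[r, r], [r, -r]]                 # X+ = (X1 + X2)/sqrt(2), X- = (X1 - X2)/sqrt(2)
    S = [[s1, 0.0], [0.0, s2]]            # covariance of (X1, X2)
    RS = [[sum(R[i][k] * S[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    # (R S R^T)[i][j] = sum_k RS[i][k] * R[j][k]
    return [[sum(RS[i][k] * R[j][k] for k in range(2)) for j in range(2)] for i in range(2)]

equal = rotate_cov(1.5, 1.5)      # equal variances: covariance stays 1.5 * identity
unequal = rotate_cov(3.0, 1.0)    # unequal variances: X+ and X- become correlated
print(equal, unequal)
```

In the unequal case the off-diagonal entry equals $(s_1-s_2)/2$, so the rotated pair is correlated unless the two variances agree.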

#### 4.2. Analysis of Example 1 Using Gaussian Optimality

**Proof sketch for the claim in Example 1.**

**Proposition 3.**

**Proof sketch for Proposition 3.**

- There exists $t>0$ such that for every ${a}^{l}\in \mathcal{U}(t)$,$$\sup_{\mathbf{A}\succeq \mathbf{0}:\,{\mathbf{A}}_{ii}={a}_{i}}\prod_{j=1}^{l}{\left[\mathbf{M}\mathbf{A}{\mathbf{M}}^{\top}\right]}_{jj}=1/{l}^{l}$$
- When ${b}^{l}={c}^{l}=(\frac{1}{l},\dots ,\frac{1}{l})$ is the uniform probability vector, (96) equals one, which is uniquely achieved by ${a}^{l}=(\frac{1}{l},\dots ,\frac{1}{l})$. To see the uniqueness, take $\mathbf{A}$ to be diagonal in the denominator and observe that the denominator is strictly bigger than the numerator when the diagonals of $\mathbf{M}\mathbf{A}{\mathbf{M}}^{\top}$ are not a permutation of ${a}^{l}$. Then, since the extreme value of a continuous function is achieved on a compact set, we can find $\epsilon >0$ such that:$$\frac{\prod_{i=1}^{l}{a}_{i}^{1/l}}{\sup_{\mathbf{A}\succeq \mathbf{0}:\,{\mathbf{A}}_{ii}={a}_{i}}\prod_{j=1}^{l}{\left[\mathbf{M}\mathbf{A}{\mathbf{M}}^{\top}\right]}_{jj}^{1/l}}<1-\epsilon $$
- Finally, by continuity, we can choose $s\in (0,t/2)$ small enough such that for any ${b}^{l},{c}^{l}\in \mathcal{U}(s)$,
  $$\frac{\prod_{i=1}^{l}{a}_{i}^{{b}_{i}}}{\sup_{\mathbf{A}\succeq \mathbf{0}:\,{\mathbf{A}}_{ii}={a}_{i}}\prod_{j=1}^{l}{\left[\mathbf{M}\mathbf{A}{\mathbf{M}}^{\top}\right]}_{jj}^{{c}_{j}}}<1-\epsilon /2,\quad \forall {a}^{l}\notin \mathcal{U}(t/2);$$
  $$\sup_{\mathbf{A}\succeq \mathbf{0}:\,{\mathbf{A}}_{ii}={a}_{i}}\prod_{j=1}^{l}{\left[\mathbf{M}\mathbf{A}{\mathbf{M}}^{\top}\right]}_{jj}^{{c}_{j}}=\sup_{\mathbf{A}:\,{\mathbf{A}}_{ii}={a}_{i}}\prod_{j=1}^{l}{\left[\mathbf{M}\mathbf{A}{\mathbf{M}}^{\top}\right]}_{jj}^{{c}_{j}},\quad \forall {a}^{l}\in \mathcal{U}(t/2);$$
  $$\exp (H({c}^{l})-H({b}^{l}))>1-\epsilon /2.$$

## 5. Relation to Hypercontractivity and Its Reverses

#### 5.1. Hypercontractivity

#### 5.2. Reverse Hypercontractivity (Positive Parameters)

#### 5.3. Reverse Hypercontractivity (One Negative Parameter)

## Author Contributions

## Acknowledgments

## Conflicts of Interest

## Appendix A. Recovering Theorem 1 from Theorem 6 as a Special Case

## Appendix B. Existence of Weakly-Convergent Couplings

**Lemma A1.**

**Proof.**

## Appendix C. Upper Semicontinuity of the Infimum

**Corollary A1.**

**Proof.**

## Appendix D. Weak Semicontinuity of Differential Entropy under a Moment Constraint

**Lemma A2.**

**Remark A1.**

**Proof.**

## Appendix E. Proof of Proposition 2

- For any $\epsilon >0$, by the continuity of measure, there exists $K>0$ such that:
  $${P}_{{X}_{i}}([-K,K])\ge 1-\frac{\epsilon}{l},\quad i=1,\dots ,l.$$
  $${P}_{\mathbf{X}}({[-K,K]}^{l})\ge 1-\epsilon $$
  $$\lim_{n\to \infty}\sum_{j=1}^{m}{c}_{j}D({P}_{{Y}_{j}}^{(n)}\parallel {\mu}_{j})=\inf_{{P}_{\mathbf{X}}}\sum_{j=1}^{m}{c}_{j}D({P}_{{Y}_{j}}\parallel {\mu}_{j})$$
  $$\sum_{j=1}^{m}{c}_{j}D({P}_{{Y}_{j}}^{\star}\parallel {\mu}_{j})\le \lim_{n\to \infty}\sum_{j=1}^{m}{c}_{j}D({P}_{{Y}_{j}}^{(n)}\parallel {\mu}_{j})$$
- Suppose ${({P}_{{X}_{i}}^{(n)})}_{1\le i\le l,n\ge 1}$ is such that $\mathbb{E}\left[{X}_{i}^{2}\right]\le {\sigma}_{i}^{2}$, ${X}_{i}\sim {P}_{{X}_{i}}^{(n)}$, where $({\sigma}_{i})$ is as in Proposition 1 and:
  $$\lim_{n\to \infty}{F}_{0}\left({({P}_{{X}_{i}}^{(n)})}_{i=1}^{l}\right)=\sup_{({P}_{{X}_{i}}):\,{\mathbf{\Sigma}}_{{X}_{i}}\preceq {\sigma}_{i}^{2}}{F}_{0}({({P}_{{X}_{i}})}_{i=1}^{l}).$$
  $$\mathbb{E}\left[{X}_{i}^{2}\right]=\lim_{K\to \infty}\mathbb{E}[\min \{{X}_{i}^{2},K\}]$$
  $$=\lim_{K\to \infty}\lim_{n\to \infty}\mathbb{E}[\min \{{({X}_{i}^{(n)})}^{2},K\}]$$
  $$\le {\sigma}_{i}^{2}$$
  $$\sum_{i}{b}_{i}D({P}_{{X}_{i}}^{\star}\parallel {\nu}_{i})\le \lim_{n\to \infty}\sum_{i}{b}_{i}D({P}_{{X}_{i}}^{(n)}\parallel {\nu}_{i})$$
  $$\sum_{i}{b}_{i}D({P}_{{X}_{i}}^{\star}\parallel {\nu}_{i})<\infty .$$
  $$\inf_{{P}_{\mathbf{X}}:\,{S}_{i}^{\ast}{P}_{\mathbf{X}}={P}_{{X}_{i}}^{\star}}\sum_{j}{c}_{j}D({T}_{j}^{\ast}{P}_{\mathbf{X}}\parallel {\mu}_{j})\ge \lim_{n\to \infty}\inf_{{P}_{\mathbf{X}}:\,{S}_{i}^{\ast}{P}_{\mathbf{X}}={P}_{{X}_{i}}^{(n)}}\sum_{j}{c}_{j}D({T}_{j}^{\ast}{P}_{\mathbf{X}}\parallel {\mu}_{j})$$
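The truncation step $\mathbb{E}[X_i^2]=\lim_{K\to \infty}\mathbb{E}[\min \{X_i^2,K\}]$ in the second item is an instance of monotone convergence: the truncated moments increase with $K$ and converge to the full second moment. The sketch below illustrates this for a standard Gaussian, chosen only as an example; the function name, grid size, and integration window are assumptions.

```python
import math

def truncated_second_moment(K, n=20001, width=10.0):
    """Trapezoid approximation of E[min(X^2, K)] for X ~ N(0, 1)."""
    h = 2 * width / (n - 1)
    total = 0.0
    for k in range(n):
        x = -width + k * h
        w = 0.5 if k in (0, n - 1) else 1.0   # endpoint weights of the trapezoid rule
        total += w * min(x * x, K) * math.exp(-0.5 * x * x)
    return total * h / math.sqrt(2 * math.pi)

# Truncation levels increase; the values increase toward E[X^2] = 1.
values = [truncated_second_moment(K) for K in (0.5, 1.0, 4.0, 16.0, 64.0)]
print(values)
```

Because $x\mapsto \min \{x^2,K\}$ is bounded and continuous for each fixed $K$, its expectation is also preserved under weak limits, which is what allows passing to the limit in $n$ before letting $K\to \infty $.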

## Appendix F. Gaussian Optimality in Degenerate Cases: A Limiting Argument

#### Appendix F.1. Proof of the Claim in Example 1

**Lemma A3.**

**Proof.**

**Lemma A4.**

#### Appendix F.2. Proof of Theorem 2

## References

1. Brascamp, H.J.; Lieb, E.H. Best constants in Young’s inequality, its converse, and its generalization to more than three functions. Adv. Math. **1976**, 20, 151–173.
2. Brascamp, H.J.; Lieb, E.H. On extensions of the Brunn-Minkowski and Prékopa-Leindler theorems, including inequalities for log concave functions, and with an application to the diffusion equation. J. Funct. Anal. **1976**, 22, 366–389.
3. Bobkov, S.G.; Ledoux, M. From Brunn-Minkowski to Brascamp-Lieb and to logarithmic Sobolev inequalities. Geom. Funct. Anal. **2000**, 10, 1028–1052.
4. Cordero-Erausquin, D. Transport inequalities for log-concave measures, quantitative forms and applications. arXiv, 2015; arXiv:1504.06147.
5. Barthe, F. On a reverse form of the Brascamp-Lieb inequality. Invent. Math. **1998**, 134, 335–361.
6. Bennett, J.; Carbery, A.; Christ, M.; Tao, T. The Brascamp-Lieb inequalities: finiteness, structure and extremals. Geom. Funct. Anal. **2008**, 17, 1343–1415.
7. Liu, J.; Courtade, T.A.; Cuff, P.; Verdú, S. Information theoretic perspectives on Brascamp-Lieb inequality and its reverse. arXiv, 2017; arXiv:1702.06260.
8. Gardner, R. The Brunn-Minkowski inequality. Bull. Am. Math. Soc. **2002**, 39, 355–405.
9. Gross, L. Logarithmic Sobolev inequalities. Am. J. Math. **1975**, 97, 1061–1083.
10. Erkip, E.; Cover, T.M. The efficiency of investment information. IEEE Trans. Inf. Theory **1998**, 44, 1026–1040.
11. Courtade, T. Outer bounds for multiterminal source coding via a strong data processing inequality. In Proceedings of the IEEE International Symposium on Information Theory, Istanbul, Turkey, 7–12 July 2013; pp. 559–563.
12. Polyanskiy, Y.; Wu, Y. Dissipation of information in channels with input constraints. IEEE Trans. Inf. Theory **2016**, 62, 35–55.
13. Polyanskiy, Y.; Wu, Y. A Note on the Strong Data-Processing Inequalities in Bayesian Networks. Available online: http://arxiv.org/pdf/1508.06025v1.pdf (accessed on 25 August 2015).
14. Liu, J.; Cuff, P.; Verdú, S. Key capacity for product sources with application to stationary Gaussian processes. IEEE Trans. Inf. Theory **2016**, 62, 984–1005.
15. Liu, J.; Cuff, P.; Verdú, S. Secret key generation with one communicator and a one-shot converse via hypercontractivity. In Proceedings of the IEEE International Symposium on Information Theory, Hong Kong, China, 14–19 June 2015; pp. 710–714.
16. Xu, A.; Raginsky, M. Converses for distributed estimation via strong data processing inequalities. In Proceedings of the IEEE International Symposium on Information Theory, Hong Kong, China, 14–19 June 2015; pp. 2376–2380.
17. Kamath, S.; Anantharam, V. On non-interactive simulation of joint distributions. arXiv, 2015; arXiv:1505.00769.
18. Kahn, J.; Kalai, G.; Linial, N. The influence of variables on Boolean functions. In Proceedings of the 29th Annual Symposium on Foundations of Computer Science, White Plains, NY, USA, 24–26 October 1988; pp. 68–80.
19. Ganor, A.; Kol, G.; Raz, R. Exponential separation of information and communication. In Proceedings of the 2014 IEEE 55th Annual Symposium on Foundations of Computer Science (FOCS), Philadelphia, PA, USA, 18–21 October 2014; pp. 176–185.
20. Dvir, Z.; Hu, G. Sylvester-Gallai for arrangements of subspaces. arXiv, 2014; arXiv:1412.0795.
21. Braverman, M.; Garg, A.; Ma, T.; Nguyen, H.L.; Woodruff, D.P. Communication lower bounds for statistical estimation problems via a distributed data processing inequality. arXiv, 2015; arXiv:1506.07216.
22. Garg, A.; Gurvits, L.; Oliveira, R.; Wigderson, A. Algorithmic aspects of Brascamp-Lieb inequalities. arXiv, 2016; arXiv:1607.06711.
23. Talagrand, M. On Russo’s approximate zero-one law. Ann. Probab. **1994**, 22, 1576–1587.
24. Friedgut, E.; Kalai, G.; Naor, A. Boolean functions whose Fourier transform is concentrated on the first two levels. Adv. Appl. Math. **2002**, 29, 427–437.
25. Bourgain, J. On the distribution of the Fourier spectrum of Boolean functions. Isr. J. Math. **2002**, 131, 269–276.
26. Mossel, E.; O’Donnell, R.; Oleszkiewicz, K. Noise stability of functions with low influences: Invariance and optimality. Ann. Math. **2010**, 171, 295–341.
27. Garban, C.; Pete, G.; Schramm, O. The Fourier spectrum of critical percolation. Acta Math. **2010**, 205, 19–104.
28. Duchi, J.C.; Jordan, M.; Wainwright, M.J. Local privacy and statistical minimax rates. In Proceedings of the IEEE 54th Annual Symposium on Foundations of Computer Science (FOCS), Berkeley, CA, USA, 26–29 October 2013; pp. 429–438.
29. Lieb, E.H. Gaussian kernels have only Gaussian maximizers. Invent. Math. **1990**, 102, 179–208.
30. Barthe, F. Optimal Young’s inequality and its converse: A simple proof. Geom. Funct. Anal. **1998**, 8, 234–242.
31. Barthe, F.; Cordero-Erausquin, D. Inverse Brascamp-Lieb inequalities along the Heat equation. In Geometric Aspects of Functional Analysis; Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 2004; Volume 1850, pp. 65–71.
32. Carlen, E.A.; Cordero-Erausquin, D. Subadditivity of the entropy and its relation to Brascamp-Lieb type inequalities. Geom. Funct. Anal. **2009**, 19, 373–405.
33. Barthe, F.; Cordero-Erausquin, D.; Ledoux, M.; Maurey, B. Correlation and Brascamp-Lieb inequalities for Markov semigroups. Int. Math. Res. Notices **2011**, 2011, 2177–2216.
34. Lehec, J. Short probabilistic proof of the Brascamp-Lieb and Barthe theorems. Can. Math. Bull. **2014**, 57, 585–587.
35. Ball, K. Volumes of sections of cubes and related problems. In Geometric Aspects of Functional Analysis; Springer: Berlin/Heidelberg, Germany, 1989; pp. 251–260.
36. Ahlswede, R.; Gács, P. Spreading of sets in product spaces and hypercontraction of the Markov operator. Ann. Probab. **1976**, 4, 925–939.
37. Csiszár, I.; Körner, J. Information Theory: Coding Theorems for Discrete Memoryless Systems, 2nd ed.; Cambridge University Press: Cambridge, UK, 2011.
38. Liu, J.; van Handel, R.; Verdú, S. Beyond the Blowing-Up Lemma: Sharp Converses via Reverse Hypercontractivity. In Proceedings of the IEEE International Symposium on Information Theory, Aachen, Germany, 25–30 June 2017; pp. 943–947.
39. Ahlswede, R.; Gács, P.; Körner, J. Bounds on conditional probabilities with applications in multi-user communication. Probab. Theory Relat. Fields **1976**, 34, 157–177.
40. Villani, C. Topics in Optimal Transportation; American Mathematical Soc.: Providence, RI, USA, 2003; Volume 58.
41. Atar, R.; Merhav, N. Information-theoretic applications of the logarithmic probability comparison bound. IEEE Trans. Inf. Theory **2015**, 61, 5366–5386.
42. Radhakrishnan, J. Entropy and counting. In Kharagpur Golden Jubilee Volume; Narosa: New Delhi, India, 2001.
43. Madiman, M.M.; Tetali, P. Information inequalities for joint distributions, with interpretations and applications. IEEE Trans. Inf. Theory **2010**, 56, 2699–2713.
44. Nair, C. Equivalent Formulations of Hypercontractivity Using Information Measures; International Zurich Seminar: Zurich, Switzerland, 2014.
45. Beigi, S.; Nair, C. Equivalent characterization of reverse Brascamp-Lieb type inequalities using information measures. In Proceedings of the IEEE International Symposium on Information Theory, Barcelona, Spain, 10–15 July 2016.
46. Bobkov, S.G.; Götze, F. Exponential integrability and transportation cost related to Logarithmic Sobolev inequalities. J. Funct. Anal. **1999**, 163, 1–28.
47. Carlen, E.A.; Lieb, E.H.; Loss, M. A sharp analog of Young’s inequality on $S^{N}$ and related entropy inequalities. J. Geom. Anal. **2004**, 14, 487–520.
48. Geng, Y.; Nair, C. The capacity region of the two-receiver Gaussian vector broadcast channel with private and common messages. IEEE Trans. Inf. Theory **2014**, 60, 2087–2104.
49. Liu, J.; Courtade, T.A.; Cuff, P.; Verdú, S. Brascamp-Lieb inequality and its reverse: An information theoretic view. In Proceedings of the IEEE International Symposium on Information Theory, Barcelona, Spain, 10–15 July 2016; pp. 1048–1052.
50. Lax, P.D. Functional Analysis; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 2002.
51. Tao, T. 245B, Notes 12: Continuous Functions on Locally Compact Hausdorff Spaces. Available online: https://terrytao.wordpress.com/2009/03/02/245b-notes-12-continuous-functions-on-locally-compact-hausdorff-spaces/ (accessed on 2 March 2009).
52. Bourbaki, N. Intégration; (Chaps. I-IV, Actualités Scientifiques et Industrielles, no. 1175); Hermann: Paris, France, 1952.
53. Dembo, A.; Zeitouni, O. Large Deviations Techniques and Applications; Springer: Berlin, Germany, 2009; Volume 38.
54. Mac Lane, S. Categories for the Working Mathematician; Springer: New York, NY, USA, 1978.
55. Hatcher, A. Algebraic Topology; Tsinghua University Press: Beijing, China, 2002.
56. Rockafellar, R.T. Convex Analysis; Princeton University Press: Princeton, NJ, USA, 2015.
57. Prokhorov, Y.V. Convergence of random processes and limit theorems in probability theory. Theory Probab. Its Appl. **1956**, 1, 157–214.
58. Verdú, S. Information Theory; In preparation; 2018.
59. Kamath, S. Reverse hypercontractivity using information measures. In Proceedings of the 53rd Annual Allerton Conference on Communications, Control and Computing, Champaign, IL, USA, 30 September–2 October 2015; pp. 627–633.
60. Wu, Y.; Verdú, S. Functional properties of minimum mean-square error and mutual information. IEEE Trans. Inf. Theory **2012**, 58, 1289–1301.
61. Godavarti, M.; Hero, A. Convergence of differential entropies. IEEE Trans. Inf. Theory **2004**, 50, 171–176.

**Figure 2.** The forward-reverse Brascamp-Lieb inequality generalizes several other functional inequalities/information theoretic inequalities. For more discussions on these relations, see the extended version [7].

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Liu, J.; Courtade, T.A.; Cuff, P.W.; Verdú, S. A Forward-Reverse Brascamp-Lieb Inequality: Entropic Duality and Gaussian Optimality. *Entropy* **2018**, *20*, 418.
https://doi.org/10.3390/e20060418
