probability_inequalities

# Probability inequalities¶

## Markov's Inequality:¶

### Core theorem¶

Proposition: Let $X$ be a random variable with $X\geq0$. Then for any positive real number $a$ and if $\mathbb{E}\left[X\right]$ exists:

$$\mathbb{P}(X\geq a)\leq\frac{\mathbb{E}\left[X\right]}{a}$$

Proof By definition, we have $\mathbb{E}\left[X\right]=\int_{\mathcal{X}}xp(x)dx$, which we can split as

$$\mathbb{E}\left[X\right]=\int_{x\geq a}xp(x)dx+\int_{x< a}xp(x)dx$$

and since by assumption $x\geq0$, we have

$$\geq \int_{x\geq a}ap(x)dx=a\cdot\int_{x\geq a}p(x)dx$$

$$=a\cdot\mathbb{P}(X\geq a)\quad\quad\square$$

### Chernoff bounding technique¶

We can use Markov's Inequality to derive for every $t>0$

$$\mathbb{P}(X\geq a)\leq\mathbb{P}(e^{tX}\geq e^{ta})\leq\frac{\mathbb{E}\left[e^{tX}\right]}{e^{ta}}$$

## Chebyshev's Inequality¶

Proposition Let $X$ be any random variable with finite mean and variance, then for any positive real $a$, we have:

$$\mathbb{P}(|X-\mathbb{E}(X)|\geq a)\leq \frac{\mathbb{V}ar(X)}{a^2}$$
Proof Let $Y=(X-\mathbb{E}\left[X\right])^2$ - $Y$ is therefore non-negative with $\mathbb{E}\left[Y\right]=\mathbb{V}ar(X)$. Then use Markov's inequality to obtain
$$\mathbb{P}(Y\geq a^2)\leq\frac{\mathbb{E}\left[Y\right]}{a^2}=\frac{\mathbb{V}ar(X)}{a^2}$$
taking the square root of on the LHS, we get

$$\mathbb{P}(Y\geq a^2)=\mathbb{P}((X-\mathbb{E}\left[X\right])^2\geq a^2)$$

$$=\mathbb{P}(|X-\mathbb{E}\left[X\right]|\geq a)$$

and the result follows from plugging in into the inequality above $\quad\quad\square$

## Jensen's inequality¶

Proposition Let $g(\cdot)$ be a convex function, then for a random variable $X$ if $\mathbb{E}\left[X\right]$ exists:

$$g(\mathbb{E}\left[X\right])\leq\mathbb{E}\left[g(X)\right]$$

Proof With $L(x)=ax+b$ as a line tangent to $g(x)$ at $\mathbb{E}\left[X\right]$, we have

$$\mathbb{E}\left[g(X)\right]\geq\mathbb{E}\left[L(X)\right]=\mathbb{E}\left[(ax+b)\right]=a\mathbb{E}\left[a\right]+b=L(\mathbb{E}\left[X\right])=g(\mathbb{E}\left[X\right])\quad\square$$

## Cauchy-Schwarz inequality¶

Proposition For r.v. $X$ and $Y$, we have

$$|\mathbb{E}\left[XY\right]|^2\leq\mathbb{E}\left[X^2\right]\mathbb{E}\left[Y^2\right]$$

Proof Define $W=(X-aY)^2$, then

$$0\leq \mathbb{E}\left[(X-aY)^2\right]=\mathbb{E}\left[X^2-2aXY+a^2Y^2\right]=$$

$$\mathbb{E}\left[X^2\right]-2a\mathbb{E}\left[XY\right]+a^2\mathbb{E}\left[Y^2\right]\quad\forall a\geq0;$$

setting

$$a=\frac{\mathbb{E}\left[XY\right]}{\mathbb{E}\left[Y^2\right]}$$

we get

$$0\leq\mathbb{E}\left[X^2\right]-2a\mathbb{E}\left[XY\right]+a^2\mathbb{E}\left[Y^2\right]=$$

$$\mathbb{E}\left[X^2\right]-2\frac{\mathbb{E}\left[XY\right]}{\mathbb{E}\left[Y^2\right]}\mathbb{E}\left[XY\right]+\frac{\mathbb{E}\left[XY\right]^2}{\mathbb{E}\left[Y^2\right]^2}\mathbb{E}\left[Y^2\right]=$$

$$\mathbb{E}\left[X^2\right]-\frac{\mathbb{E}\left[XY\right]^2}{\mathbb{E}\left[Y^2\right]^2}$$

and the proposition follows$\quad\square$

## Hoeffding's inequality¶

### 1) Hoeffding's lemma¶

Proposition: For $X$ a random variable with $\mathbb{E}\left[X\right]=0$ and $a\leq X\leq b,\,b>a$ and any $t>0$:

$$\mathbb{E}\left[e^{tX}\right]\leq e^\frac{t^2(b-a)^2}{8}$$

Proof: By convexity of $e^x$, we have

$$e^{tx}\leq\frac{b-x}{b-a}e^{ta}+\frac{x-a}{b-a}e^{tb}$$

setting $\mathbb{E}\left[X\right]=0$:

$$\mathbb{E}\left[e^{tX}\right]\leq\mathbb{E}\left[\frac{b-X}{b-a}e^{ta}+\frac{X-a}{b-a}e^{tb}\right]=\frac{b}{b-a}e^{ta}+\frac{-a}{b-a}e^{tb}=e^{\phi(t)}$$

where

$$\phi(t)=log\left(\frac{b}{b-a}e^{ta}+\frac{-a}{b-a}e^{tb}\right)$$

$$=ta+log\left(\frac{b}{b-a}+\frac{-a}{b-a}e^{t(b-a)}\right).$$

Now, for any $t>0$,

$$\phi'(t)=a-\frac{a}{\frac{b}{b-a}e^{-t(b-a)}-\frac{a}{a-b}}$$

$$\phi''(t)=\frac{\alpha}{\left[(1-\alpha)e^{-t(b-a)}+\alpha\right]} \frac{(1-\alpha) e^{-t(b-a)}}{\left[(1-\alpha)e^{-t(b-a)}+\alpha\right]}(b-a)^2$$

with

$$\alpha = \frac{-a}{b-a}$$

$\phi(0)=\phi'(0)=0$ and $\phi''(t)=u(1-u)(b-a)^2$ with $u=\frac{\alpha}{\left[(1-\alpha)e^{-t(b-a)}+\alpha\right]}$. As $u\in\left[0,1\right]$, it follows that $u(1-u)\leq\frac{1}{4}$ and $\phi''(t)\leq\frac{(b-a)^2}{4}$.

Finally, via Taylor's Theorem (Lemma 2):

$$\phi(t)=\phi(0)+t\phi'(0)+\frac{t^2}{2}\phi''(\theta)\leq t^2\frac{(b-a)^2}{8}$$

from which the proposition follows. $\quad\square$

### 2) Hoeffding's inequality¶

Proposition: Let $X_1,...,X_n$ be independent random variables with $X_i$ taking values in $\left[a_i,b_i\right]\forall i\in n$. Then for any $\epsilon>0$ and $S_n=\sum_{i=1}^nX_i$:

$$\mathbb{P}\left[S_n-\mathbb{E}\left[S_n\right]\geq\epsilon\right]\leq e^\frac{2\epsilon^2}{\sum_{i=1}^n(b_i-a_i)^2}$$

$$\mathbb{P}\left[S_n-\mathbb{E}\left[S_n\right]\leq\epsilon\right]\leq e^\frac{2\epsilon^2}{\sum_{i=1}^n(b_i-a_i)^2}$$

Proof: Via Hoeffding's lemma and the Chernoff bounding technique, we obtain

$$\mathbb{P}\left[S_n-\mathbb{E}\left[S_n\right]\geq\epsilon\right]\leq e^{-t\epsilon}\mathbb{E}\left[e^{t(S_n-\mathbb{E}\left[S_n\right])}\right]$$

$$=e^{-t\epsilon}\prod_{i=1}^n\mathbb{E}\left[e^{t(X_i-\mathbb{E}\left[X_i\right])}\right]$$

$$\leq e^{t\epsilon}\prod_{i=1}^ne^{t^2(b_i-a_i)^2/8}$$

$$=e^{-t\epsilon}e^{t^2\sum_{i=1}^n(b_i-a_i)^2/8}$$

Finally, with $t=\frac{4\epsilon}{\sum_{i=1}^n(b_i-a_i)^2}$ we get

$$e^{-t\epsilon}e^{t^2\sum_{i=1}^n(b_i-a_i)^2/8}$$

$$\leq e^\frac{-2\epsilon^2}{\sum_{i=1}^n(b_i-a_i)^2}$$

Second inequality TBD