M.1 Markov’s inequality
Markov’s inequality is a foundational result in measure theory and probability theory that provides an upper bound on the probability that a non-negative random variable exceeds a given threshold. It is particularly useful for bounding tail probabilities when only the mean is known, and it is the basic building block for proving other inequalities such as Chebyshev’s inequality below.
Theorem M.1 (Markov’s Inequality) Let X be a non-negative random variable and let a > 0. Then,
\Pr(X \geq a) \leq \frac{\mathbb{E}[X]}{a}.
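A standard one-line derivation, assuming only that \mathbb{E}[X] exists: since X is non-negative, X \geq a\,\mathbf{1}\{X \geq a\} pointwise, where \mathbf{1}\{X \geq a\} denotes the indicator of the event \{X \geq a\}. Taking expectations of both sides gives
\mathbb{E}[X] \geq a \Pr(X \geq a),
and dividing by a > 0 yields the inequality.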
M.2 Chebyshev’s inequality
Chebyshev’s inequality is a powerful tool in probability theory that provides an upper bound on the probability that a random variable deviates from its mean by more than a given number of standard deviations. It is particularly useful for establishing concentration of measure and for proving other inequalities.
- It can be used to prove the weak law of large numbers.
- It can be used like the empirical rule, but for a broad class of distributions: it implies that at least 75% of the values lie within two standard deviations of the mean and at least 8/9 ≈ 88.9% lie within three standard deviations.
Theorem M.2 (Chebyshev’s Inequality) Let X be a random variable with mean \mu = \mathbb{E}[X] and finite variance \sigma^2 = \operatorname{Var}(X). Then for any k > 0, \Pr(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}.
Theorem M.3 (Measure-theoretic Chebyshev’s Inequality) Let (\Omega, \mathcal{F}, \mathbb{P}) be a probability space and let X be a random variable measurable with respect to \mathcal{F} with finite variance. If \mu = \mathbb{E}[X] and \sigma^2 = \operatorname{Var}(X), then for any k > 0, \Pr(|X - \mu| \geq k\sigma) \leq \frac{1}{k^2}.
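Chebyshev’s inequality follows directly from Markov’s inequality applied to the non-negative random variable (X - \mu)^2 with threshold a = k^2\sigma^2, assuming \sigma^2 is finite and positive:
\Pr(|X - \mu| \geq k\sigma) = \Pr\big((X - \mu)^2 \geq k^2\sigma^2\big) \leq \frac{\mathbb{E}[(X - \mu)^2]}{k^2\sigma^2} = \frac{\sigma^2}{k^2\sigma^2} = \frac{1}{k^2}.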
M.3 Cantelli’s inequality
Cantelli’s inequality is a one-sided refinement of Chebyshev’s inequality: it bounds the probability that a random variable exceeds its mean by a given number of standard deviations, rather than bounding deviations in both directions at once. It is particularly useful in statistical inference and hypothesis testing.
Theorem M.4 (Cantelli’s Inequality) Let X be a random variable with mean \mu = \mathbb{E}[X] and variance \sigma^2 = \operatorname{Var}(X). Then for any k > 0, \Pr(X - \mu \geq k\sigma) \leq \frac{1}{1 + k^2}.
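A standard derivation shifts the variable before applying Markov’s inequality: for any u > 0, the event \{X - \mu \geq k\sigma\} implies (X - \mu + u)^2 \geq (k\sigma + u)^2, so
\Pr(X - \mu \geq k\sigma) \leq \frac{\mathbb{E}[(X - \mu + u)^2]}{(k\sigma + u)^2} = \frac{\sigma^2 + u^2}{(k\sigma + u)^2}.
Choosing u = \sigma/k, which minimizes the right-hand side over u > 0, gives the stated bound \frac{1}{1 + k^2}.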
M.4 Bhattacharyya’s inequality
Bhattacharyya’s inequality is a refinement of Cantelli’s inequality that provides a two-sided bound on the probability that a random variable deviates from its mean. It is particularly useful in statistical inference and hypothesis testing. The neat idea is that it brings the third and fourth moments of the distribution into play to achieve this.
Theorem M.5 (Bhattacharyya’s Inequality) Let X be a random variable with mean \mu = \mathbb{E}[X] and variance \sigma^2 = \operatorname{Var}(X). Then for any k > 0, \Pr(|X - \mu| \geq k\sigma) \leq \frac{2}{1 + k^2}.
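As a consistency check that uses only the mean and variance (not the higher moments), the displayed two-sided bound can be recovered by applying Cantelli’s inequality to X and to -X and adding the two one-sided bounds:
\Pr(|X - \mu| \geq k\sigma) \leq \Pr(X - \mu \geq k\sigma) + \Pr(\mu - X \geq k\sigma) \leq \frac{1}{1 + k^2} + \frac{1}{1 + k^2} = \frac{2}{1 + k^2}.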
M.5 Kolmogorov’s inequality
Kolmogorov’s inequality is a fundamental result in probability theory that provides an upper bound on the probability that the maximum of the absolute partial sums of independent random variables exceeds a given threshold. It is particularly useful in the context of stochastic processes and random walks, and it is a key step in the proof of the strong law of large numbers.
Theorem M.6 (Kolmogorov’s Inequality) Let X_1, X_2, \ldots, X_n be independent random variables with zero mean and finite variances. Then for any \lambda > 0,
\Pr \left(\max_{1 \leq k \leq n} |S_k| \geq \lambda \right) \leq \frac{1}{\lambda^2} \operatorname{Var}(S_n) = \frac{1}{\lambda^2} \sum_{k=1}^{n} \operatorname{Var}(X_k) = \frac{1}{\lambda^2} \sum_{k=1}^{n} \mathbb{E}[X_k^2], \tag{M.1}
where S_k = X_1 + X_2 + \ldots + X_k is the partial sum of the first k random variables.
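For n = 1 the maximum is simply |S_1| = |X_1|, and the inequality reduces to Chebyshev’s inequality for a zero-mean random variable:
\Pr(|X_1| \geq \lambda) \leq \frac{\mathbb{E}[X_1^2]}{\lambda^2}.
In this sense Kolmogorov’s inequality can be read as a maximal version of Chebyshev’s inequality: the same bound controls the entire running maximum of the partial sums, not just the final sum.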