1. Introduction

In this tutorial, we’ll explain statistical independence.

2. Why Is the Independence of Events Important?

Merriam-Webster lists several meanings of the word independent. In mathematics, we usually go with a variation of “not determined by or capable of being deduced or derived from or expressed in terms of members (such as axioms or equations) of the set under consideration”.

Something similar holds in probability theory and statistics. We say that two events are statistically independent if the probability of one doesn’t change when we learn that the other has occurred (or hasn’t), and vice versa.

But why do we bother with this?

Let’s say we’re developing a vaccine and want to check if an expensive component improves the protection. We need to test the vaccine on two randomly collected groups of test subjects. One group gets the expensive version, and the other receives the version without the component. If the incidence of the disease is the same in both groups, the chance that the vaccine protects against it won’t change if we add the component.

So, the events “the vaccine contains this very expensive component” and “the vaccine protects those who get a shot” will be independent. Consequently, we can produce the vaccine without the expensive component and save money.

3. Statistical Independence of Two Events

The independence of events has a precisely defined meaning in statistics. Events \boldsymbol{A} and \boldsymbol{B} are independent of one another (which we write as \boldsymbol{A \perp B}) if their joint probability is the product of their individual probabilities:

(1)   \begin{equation*}  P(A \cap B) = P(A) \times P(B) \end{equation*}

For instance, if P(A) = 0.5 and P(B)=0.7, the joint probability P(A \cap B) needs to be 0.35 for A and B to be independent.
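As a minimal sketch of Equation (1), we can check the product rule with exact fractions. The helper name is_independent is our own choice for illustration, and the second joint probability (0.4) is a made-up value that shows a failing case:

```python
from fractions import Fraction


def is_independent(p_a, p_b, p_joint):
    """Return True if P(A and B) equals P(A) * P(B) exactly."""
    return p_joint == p_a * p_b


p_a = Fraction(1, 2)   # P(A) = 0.5
p_b = Fraction(7, 10)  # P(B) = 0.7

print(is_independent(p_a, p_b, Fraction(35, 100)))  # True
print(is_independent(p_a, p_b, Fraction(2, 5)))     # False
```

We use Fraction instead of floats so that the equality test isn't affected by rounding errors.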

3.1. Conditional Interpretation

We can make independence more intuitive by using conditional probabilities. Let’s recall that:

    \[P(A \cap B) = P(A \mid B) \times P(B) = P(B \mid A) \times P(A)\]

So, if A and B are independent, we get:

    \[P(A) \times P(B) = P(A \cap B) = P(A \mid B) \times P(B)\]

From there, dividing by P(B) (and, by the symmetric argument, by P(A)), assuming both probabilities are positive:

    \[\begin{aligned} P(A) &= P(A \mid B) \\ P(B) &= P(B \mid A) \end{aligned}\]

Therefore, if A \perp B, the realization of \boldsymbol{B} doesn’t change the probability that \boldsymbol{A} will happen and vice versa.

For instance, if we flip a coin and get a head, the chances for the second toss to result in either side remain unaffected by the first outcome.
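We can verify the coin example by enumerating the sample space. This is only a sketch that assumes a fair coin, so all four outcomes are equally likely. The helper names prob and cond_prob are ours:

```python
from fractions import Fraction
from itertools import product

# Sample space of two fair coin tosses; every outcome is equally likely.
omega = ["".join(t) for t in product("HT", repeat=2)]


def prob(event):
    """Probability of an event given as a set of outcomes."""
    return Fraction(len(event), len(omega))


def cond_prob(a, b):
    """P(A | B) = P(A and B) / P(B), assuming P(B) > 0."""
    return prob(a & b) / prob(b)


first_head = {w for w in omega if w[0] == "H"}
second_head = {w for w in omega if w[1] == "H"}

print(prob(second_head))                   # 1/2
print(cond_prob(second_head, first_head))  # 1/2
```

Both values coincide, so learning the first outcome doesn't change the probability of a head on the second toss.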

3.2. Visualization

Let’s visualize the probabilities in question as surfaces. If A \perp B, the proportion of the area of B that A \cap B takes should be the same as the proportion of the entire event space \Omega that A takes:

Visualized independence

4. Statistical Independence of Multiple Events

There are two notions of independence for multiple events.

4.1. Pairwise Independence

We say that events \boldsymbol{A_1, A_2, \ldots, A_n} are pairwise independent if any two distinct events \boldsymbol{A_i, A_j \in \{A_1, A_2, \ldots, A_n\}} are independent of one another in the sense of Equation (1):

(2)   \begin{equation*}  (\forall i, j \in \{1, 2, \ldots, n\}) i \neq j \implies A_i \perp A_j \end{equation*}

However, that doesn’t mean that the probability of the intersection of three or more events factors into the product of the individual events’ probabilities.

Here’s a classical example illustrating this. If we toss a coin two times, we have four possible outcomes: \{HH, HT, TH, TT\}. Let’s define three events:

  • A: the first toss gives us a head
  • B: the second toss gives us a head
  • C: the same side appears both times


Event space

Assuming that the coin is fair, all the outcomes are equally likely, so P(A) = P(B) = P(C) = \frac{2}{4} = \frac{1}{2}.


    \[\begin{aligned} P(A \cap B) &= \frac{1}{4} = \frac{1}{2} \times \frac{1}{2} = P(A) \times P(B) \\ P(A \cap C) &= \frac{1}{4} = \frac{1}{2} \times \frac{1}{2} = P(A) \times P(C) \\ P(B \cap C) &= \frac{1}{4} = \frac{1}{2} \times \frac{1}{2} = P(B) \times P(C) \\ \end{aligned}\]

As we see, the events are pairwise independent. However:

    \[P(A \cap B \cap C) = \frac{1}{4} \neq \frac{1}{8} = P(A) \times P(B) \times P(C)\]

So, A isn’t independent of B \cap C, B isn’t independent of A \cap C, and C isn’t independent of A \cap B. Mutual independence is the stronger notion that rules such cases out.
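The calculation above can be reproduced by brute force. In this sketch, we assume that B is the event that the second toss gives a head, which is the reading consistent with the probabilities computed above. The names events and prob are ours:

```python
from fractions import Fraction
from itertools import combinations

omega = {"HH", "HT", "TH", "TT"}


def prob(event):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(omega))


events = {
    "A": {w for w in omega if w[0] == "H"},   # first toss is a head
    "B": {w for w in omega if w[1] == "H"},   # second toss is a head (assumed)
    "C": {w for w in omega if w[0] == w[1]},  # same side both times
}

# Check Equation (1) for every pair of distinct events.
pairwise = all(
    prob(events[x] & events[y]) == prob(events[x]) * prob(events[y])
    for x, y in combinations(events, 2)
)
triple = prob(events["A"] & events["B"] & events["C"])
product_of_three = prob(events["A"]) * prob(events["B"]) * prob(events["C"])

print(pairwise)                  # True
print(triple, product_of_three)  # 1/4 1/8
```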

4.2. Mutually Independent Events

Events \boldsymbol{A_1, A_2, \ldots, A_n} are mutually independent if any event is independent of any intersection of other events from the set.

So, we want the following to hold for any A_i and any distinct A_{j_1}, A_{j_2}, \ldots, A_{j_k} such that 1 \leq k \leq n - 1 and i \neq j_{\ell} for any \ell=1,2,\ldots,k:

    \[P \left(A_i \cap \left( \bigcap\limits_{\ell=1}^{k} A_{j_{\ell}} \right) \right) = P(A_i) \times P\left( \bigcap\limits_{\ell=1}^{k} A_{j_{\ell}} \right)\]

But, this also needs to hold for all the events and intersections in A_{j_1}, A_{j_2}, \ldots, A_{j_k}. So, we can write the condition compactly as:

(3)   \begin{equation*} (\forall k = 2, 3, \ldots, n) \quad  \{i_1, i_2, \ldots, i_k\} \subseteq \{1,2,\ldots,n\} \implies P \left( \bigcap\limits_{\ell=1}^{k} A_{i_{\ell}} \right) = \prod_{\ell=1}^{k} P(A_{i_{\ell}}) \end{equation*}

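Therefore, mutually independent events are also pairwise independent: that’s the case k = 2 of Equation (3). As the coin example from Section 4.1 shows, the converse doesn’t hold.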
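Here's one possible way to test the product condition for every sub-collection of at least two events. It's a sketch that assumes a finite sample space with equally likely outcomes. The function name mutually_independent and both test collections are our own illustrations:

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod


def prob(event, omega):
    """Probability of an event under equally likely outcomes."""
    return Fraction(len(event), len(omega))


def mutually_independent(events, omega):
    """Check the product rule for every sub-collection of size k >= 2."""
    for k in range(2, len(events) + 1):
        for subset in combinations(events, k):
            joint = prob(set.intersection(*subset), omega)
            if joint != prod(prob(e, omega) for e in subset):
                return False
    return True


# Three fair tosses: "toss i is a head" for i = 0, 1, 2.
omega = {"".join(t) for t in product("HT", repeat=3)}
heads = [{w for w in omega if w[i] == "H"} for i in range(3)]
print(mutually_independent(heads, omega))  # True

# Two fair tosses: first is a head, second is a head, both sides match.
two_tosses = {"HH", "HT", "TH", "TT"}
abc = [{"HH", "HT"}, {"HH", "TH"}, {"HH", "TT"}]
print(mutually_independent(abc, two_tosses))  # False
```

The number of sub-collections grows exponentially with n, so this exhaustive check is practical only for small collections of events.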

5. Conditional Independence

Since conditional probability is itself a probability, we can also define conditional independence:

(4)   \begin{equation*} P(A \cap B \mid C ) = P(A \mid C) \times P(B \mid C) \end{equation*}

If the above holds for events A, B, and C, we say that A and B are conditionally independent given C.

Intuitively, that means that if C happened, the realization of B doesn’t affect the chance of A occurring. Or, in line with Bayesianism, any information on B reveals nothing about A provided that we know C has happened.
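To illustrate Equation (4), let's consider a hypothetical experiment that isn't part of the examples above. We pick one of two biased coins at random (X lands heads with probability 9/10, Y with probability 1/10) and toss it twice. In this sketch, A and B are "the first toss is a head" and "the second toss is a head", and C is "we picked coin X":

```python
from fractions import Fraction
from itertools import product

# Assumed head probabilities of the two hypothetical coins.
head_prob = {"X": Fraction(9, 10), "Y": Fraction(1, 10)}

# Map each outcome (coin, first toss, second toss) to its probability.
weights = {}
for coin, t1, t2 in product(head_prob, "HT", "HT"):
    p = Fraction(1, 2)  # each coin is picked with probability 1/2
    for toss in (t1, t2):
        p *= head_prob[coin] if toss == "H" else 1 - head_prob[coin]
    weights[(coin, t1, t2)] = p


def prob(pred):
    """Total probability of the outcomes satisfying the predicate."""
    return sum((p for w, p in weights.items() if pred(w)), Fraction(0))


def cond_prob(pred, given):
    """P(pred | given), assuming P(given) > 0."""
    return prob(lambda w: pred(w) and given(w)) / prob(given)


def a(w): return w[1] == "H"   # first toss is a head
def b(w): return w[2] == "H"   # second toss is a head
def c(w): return w[0] == "X"   # coin X was picked
def a_and_b(w): return a(w) and b(w)


print(cond_prob(a_and_b, c) == cond_prob(a, c) * cond_prob(b, c))  # True
print(prob(a_and_b) == prob(a) * prob(b))                          # False
```

So, A and B are conditionally independent given C, although they aren't independent unconditionally: a head on the first toss makes it more likely that we hold coin X.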

6. Independence of Random Variables

We can use the above notions of independence to define the independence of random variables. Since the same reasoning holds for pairwise and mutually independent random variables, we’ll focus on the independence of only two variables.

Variables \boldsymbol{A} and \boldsymbol{B} are stochastically independent if the events \boldsymbol{\{A \leq a\}} and \boldsymbol{\{B \leq b\}} are independent for any \boldsymbol{a} and \boldsymbol{b}.

If F_A and F_B denote the variables’ cumulative distribution functions, and F_{AB} their joint CDF, the definition comes down to:

(5)   \begin{equation*} F_{AB}(a, b)=F_A(a) \times F_B(b) \end{equation*}
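As a small sketch of Equation (5), we can test the factorization for two fair dice rolled separately. The dice, the dictionary joint, and the helper names are our assumptions for illustration. Since both variables take only the values 1 to 6, it's enough to check the CDFs at those points:

```python
from fractions import Fraction
from itertools import product

# Joint distribution of two fair dice rolled separately (an assumed example).
faces = range(1, 7)
joint = {(x, y): Fraction(1, 36) for x, y in product(faces, faces)}


def joint_cdf(a, b):
    """F_AB(a, b) = P(A <= a and B <= b)."""
    return sum((p for (x, y), p in joint.items() if x <= a and y <= b), Fraction(0))


def marginal_cdf(value, axis):
    """CDF of the first (axis=0) or second (axis=1) variable."""
    return sum((p for xy, p in joint.items() if xy[axis] <= value), Fraction(0))


factorizes = all(
    joint_cdf(a, b) == marginal_cdf(a, 0) * marginal_cdf(b, 1)
    for a, b in product(faces, faces)
)
print(factorizes)  # True
```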

7. Statistical vs. Everyday-Language Independence

The notions of independence in everyday language differ from those in probability and statistics.

Non-statisticians and non-mathematicians usually understand the independence of events A and B to mean they are completely unrelated. For example, when flipping a coin two times, people could say that the two outcomes aren’t independent because the same coin is used both times.

In contrast, a statistician would argue that the outcome of one flip doesn’t affect the realization of the other. So they would consider the two events statistically independent.

8. Conclusion

In this tutorial, we explained the statistical independence of events and random variables.