Sampling From the Exponential Distribution | Baeldung on Computer Science

1. Introduction

In this tutorial, we’ll implement a generator of pseudorandom numbers that follow an exponential distribution.

2. Exponential Distribution

Exponential distributions are useful for modeling the time passed until an event occurs. For instance, the duration of a call is distributed exponentially: short calls are frequent, whereas long ones are rare. Other phenomena also follow an exponential distribution. For instance, that’s the case with the amount of money customers spend on a trip to a market. Here’s an example of what a thousand customers typically spend in a supermarket:

As we see, many customers spend USD 40 or less, but very few spend over USD 1000.

2.1. Math

An exponential distribution has a parameter $\boldsymbol{\lambda > 0}$ . It quantifies the speed at which the occurrence probabilities of values decrease. The distribution’s probability density function (PDF) is:

(1) $\begin{equation*} f_{\lambda}(x) = \lambda e^{-\lambda x} \end{equation*}$

and its cumulative density function (CDF) is:

(2) $\begin{equation*} F_{\lambda}(x) = 1 - e^{-\lambda x} \end{equation*}$

The formulae show that the decrease speed (also known as decay) is exponential, hence the name.

3. The Inversion Method

The inversion method uses a property of CDFs: regarded as random variables, they follow the uniform distribution over $[0, 1]$ : $\mathrm{U}[0, 1]$ . What does that mean? Well, if we sample a lot of numbers $x_1, x_2, \ldots, x_n$ from an exponential distribution and draw a histogram of the corresponding CDFs, we’ll see a uniform PDF.

But, the converse is also true. If we sample $\boldsymbol{n}$ uniform values $\boldsymbol{u_1, u_2, \ldots, u_n}$ from $\boldsymbol{\mathrm{U}[0, 1]}$ and calculate the inverses $\boldsymbol{F_{\lambda}^{-1}(u_1), F_{\lambda}^{-1}(u_2), \ldots, F_{\lambda}^{-1}(u_n)}$ , we’ll get numbers from the exponential distribution with the decay parameter $\lambda$ , $\mathrm{Exp}[\lambda]$ .

This method has a straightforward geometric interpretation. As the first step, we place the $u_i$ values on the $y$ -axis. Then, we find the points of the $F_{\lambda}$ line whose $y$ -values are equal to the drawn $u_i$ . The found points’ $x$ -values are the exponential numbers we wanted to draw:

To use the method in programs, we need the inverse $F_{\lambda}^{-1}$ of the exponential CDF. We can derive it analytically:

(3) $\begin{equation*} F_{\lambda}^{-1}(u) = -\frac{1}{\lambda} \ln(1-u) \end{equation*}$

3.1. Pseudocode

Here’s the pseudocode of the inversion method:

algorithm InversionMethod(λ, n):
    // INPUT
    //    λ = the decay parameter of the target exponential distribution
    //    n = how many numbers to draw from it
    // OUTPUT
    //    sample = an array of n numbers from the exponential distribution with the decay λ

    sample <- initialize an empty array

    for i <- 1 to n:
        u <- draw a random number from U[0, 1]
        x <- - (1 / λ) * ln(1 - u)
        append x to sample

    return sample

All we need is $\boldsymbol{n}$ numbers from $\boldsymbol{\mathrm{U}[0, 1]}$ , and the inversion method can transform them to follow $\boldsymbol{\mathrm{Exp}[\lambda]}$ for any $\boldsymbol{\lambda}$ .

Since $u$ comes from $\mathrm{U}[0, 1]$ , the difference $1-u$ also follows $\mathrm{U}[0, 1]$ . So, some implementations skip the subtraction step:

(4) $\begin{equation*} x_i = -\frac{1}{\lambda} \ln(u_i) \end{equation*}$

3.2. Example

Let’s say we want ten numbers from $\mathrm{Exp}[1]$ .

First, we draw ten values from $U[0, 1]$ . For instance, let them be $0.09, 0.2, 0.3, 0.42, 0.55, 0.61, 0.7, 0.8, 0.9, 0.99$ .

Then, we calculate the inverses:

$\begin{equation*} \begin{aligned} x_1 &= - \ln(1 - 0.09) &= 0.09 \\ x_2 &= - \ln(1 - 0.2) &= 0.22 \\ x_3 &= - \ln(1 - 0.3) &= 0.36 \\ x_4 &= - \ln(1 - 0.42) &= 0.54 \\ x_5 &= - \ln(1 - 0.55) &= 0.8 \\ x_6 &= - \ln(1 - 0.61) &= 0.94 \\ x_7 &= - \ln(1 - 0.7) &= 1.2 \\ x_8 &= - \ln(1 - 0.8) &= 1.61 \\ x_9 &= - \ln(1 - 0.9) &= 2.3 \\ x_{10} &= - \ln(1 - 0.99) &= 4.61 \\ \end{aligned} \end{equation*}$

That way, we get ten numbers from $\mathrm{Exp}[1]$ :

Example: Pseudorandom number generation via the inversion method

4. Rejection Sampling

To use the inversion method, we need the inverse $\boldsymbol{F_{\lambda}^{-1}}$ . However, many distributions don’t have an invertible CDF, e.g., the normal distributions. Further, even if we can derive the inverse analytically, it may be too complex to compute efficiently, or we may not have an implementation in our software.

In such cases, we can apply rejection sampling. It uses the PDF plot of the target distribution and selects points from it randomly. If a selected point is below the PDF line, the method keeps its $x$ -coordinate. Otherwise, it rejects it. We repeat this step until collecting the desired number of samples.

Here’s the visual representation of rejection sampling for $\mathrm{Exp}[\lambda]$ :

Red crosses represent the rejected points, whereas blue circles denote those we kept. As we see, there are more accepted $x$ -values under the regions with higher overall densities.

Why does rejection sampling work? Since the PDF has high values in the high-density regions, they have a greater chance of capturing a randomly drawn point.

4.1. Pseudocode

Here’s the pseudocode of rejection sampling adapted to the exponential distribution $\mathrm{Exp}[\lambda]$ :

algorithm RejectionSampling(λ, n, r):
    // INPUT
    //    λ = the decay parameter of the target exponential distribution
    //    n = how many numbers to draw from it
    //    r = the upper bound for x-coordinates
    // OUTPUT
    //    sample = an array of n numbers from the exponential distribution with the decay λ

    sample <- an empty array
    m <- 0

    while m < n:
        x <- draw a random number from [0, r]
        y <- draw a random number from [0, λ]

        if y <= f_λ(x):
            append x to sample

    return sample

To select a random point in the plot according to a 2D uniform distribution, we can draw the $x$ and $y$ coordinates independently.

Since the support of $\mathrm{Exp}[\cdot]$ is $[0, \infty]$ , we can propose any non-negative value as $x$ . However, using the maximal positive number as the bound would result in many rejected values and poor performance. So, we restrict sampling of the $\boldsymbol{x}$ -coordinates to $\boldsymbol{[0, r]}$ , where $r$ is a suitable upper bound.

How do we choose $r$ ? We can set it to a value that very rarely occurs. For instance, $99.99\%$ of $\mathrm{Exp}[\lambda]$ is to the left of $-\frac{\ln(0.0001)}{\lambda}$ . For $\lambda=1$ , that is approximately 9.21.

Similarly, we sample $\boldsymbol{y}$ from $\boldsymbol{\mathrm{U}[0, \max{f_{\lambda}}]}$ . That way, we speed up sampling because we won’t get points that are certainly above the PDF of $\mathrm{Exp}[\lambda]$ and have a zero chance of being kept.

Since the maximum density of $Exp[\lambda]$ is at $x=0$ and is equal to $\lambda$ , we sample $y$ -coordinates from $\mathrm{U}[0, \lambda]$ .

4.2. Example

The first several iterations of rejection sampling for generating numbers from $\mathrm{Exp}[1]$ could go like this:

$point$	$x$	$y$	$f_1(x)$	$decision$
1	1.17	0.73	0.31	reject
2	4.38	0.11	0.01	reject
3	2.61	0.47	0.07	reject
4	1.72	0.98	0.18	reject
5	8.68	0.64	0.0	reject
6	4.79	0.01	0.01	keep
7	0.65	0.67	0.52	reject
8	0.38	0.43	0.68	keep
9	2.68	0.92	0.07	reject
$\ldots$

Here, we can see a potential problem rejection sampling can face. It can generate many more points than we need. That can happen if the algorithm draws points mostly above the PDF line.

4.3. Optimization

One way to address this issue is to set $r$ to a smaller value. For instance, we can use the $99\%$ -quantile or $95\%$ -quantile of the target $\mathrm{Exp}[\lambda]$ . However, trimming the distribution too much skews the generated numbers to the left. If that happened, we wouldn’t sample from $\mathrm{Exp}[\lambda]$ , but from a similar yet different distribution.

We can also play with drawing points from the plot. Here, we used a 2D uniform distribution over $[0, r] \times [0, \lambda]$ to sample them, but any distribution would have worked. The only question is how to choose the one that best suits our target distribution.

5. Conclusion

In this article, we talked about sampling from an exponential distribution. We presented two techniques: the inversion method and rejection sampling.

The inversion method uses the inverse of the exponential CDF. Rejection sampling doesn’t require the inverse but can take longer.

Core Concepts

Operating Systems

Neural Networks

Graph Theory

Latex

Full Archive

About Baeldung