What’s the P99 Latency? | Baeldung on Computer Science

1. Introduction

In this tutorial, we’ll study percentiles and their application to network latency and application response time.

2. Percentiles

A percentile of a distribution or a sample is a value greater than the given percentage of observations in the same group.

For example, if we say that Sam’s GRE score lies in the $95^{th}$ percentile, we mean that Sam has performed better than $95\%$ of all GRE test takers.

So, we use percentiles to express how a given value compares to others in the same set. The following figure illustrates the $95^{th}$ and $99^{th}$ percentiles for a normal distribution of a test score with mean value $\mu = 720$ and standard deviation $\sigma = 30$:

As evident in the figure, the $95^{th}$ percentile value is 769.3, and the $99^{th}$ percentile value is 789.8.
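We can check these two values numerically. The snippet below is a minimal sketch that assumes the scores are exactly normally distributed with $\mu = 720$ and $\sigma = 30$, and uses `NormalDist` from Python's standard `statistics` module. The variable names are ours, chosen only for illustration:

```python
from statistics import NormalDist

# Test-score distribution from the example: mean 720, standard deviation 30
scores = NormalDist(mu=720, sigma=30)

# inv_cdf(q) returns the value below which a fraction q of the distribution lies
p95 = scores.inv_cdf(0.95)
p99 = scores.inv_cdf(0.99)

print(f"P95 = {p95:.1f}, P99 = {p99:.1f}")  # P95 = 769.3, P99 = 789.8
```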

More formally, the $\mathbf{k^{th}}$ percentile of a distribution ( $\mathbf{k=1,2,\ldots,100}$ ) is greater than or equal to $\mathbf{k}$ % of values in the same distribution. We can also calculate the percentile of a finite sample, in which case we’re talking about sample estimates, not distributional percentiles.

2.1. Percentile vs. Percentage

Percentile is not the same as percentage.

The key difference is that a percentage is a part of a whole, whereas a percentile is a value that a specific percentage of the other values in the same group fall below.

3. Latency Percentiles

We define latency as the total time it takes for a data packet to travel from its origin to its destination. We usually refer to latency in the context of a network. Network latency is one of the key values we use to define the quality of service that a provider offers to a customer. We usually measure latency in milliseconds.

The lower the latency, the better the user experience.

3.1. P99 Latency

We usually describe latency with its 99th percentile or P99.

So, if we say that our HTTP-based web application has a P99 latency of 2 milliseconds, then we mean that $\mathbf{99}$ % of web calls are serviced with a response within 2 milliseconds. Conversely, only $1\%$ of calls get a delayed response of over 2 milliseconds.

3.2. Why P99 Latency?

People often use basic statistical measures such as the minimum, mean, median, or maximum value to describe a data set. The problem is that these statistics can misrepresent latency data. The mean and median often mask outliers, whereas the minimum or maximum may itself be an outlier.

Network administrators aim to optimize the $\mathbf{99^{th}}$ percentile of network latency to improve the overall response time under peak load. They also use percentile-based alerts for monitoring because these alerts have low false-positive rates. Such alerts are also far less volatile, so they highlight the performance degradation events that matter.

The P99 latency is greater than or equal to $99\%$ of all latencies, so optimizing it is like improving the performance in the near-worst case. At the same time, since P99 isn’t the maximal value, we expect a few extreme outliers not to influence it as strongly.
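To see how these statistics can disagree, here is a small sketch on a made-up latency sample. The data is hypothetical (mostly fast requests, a few slow ones, and a single extreme outlier), and the helper `p99` is our own illustration that takes the value at rank $ceil(0.99 \cdot n)$ in the sorted sample:

```python
import math
from statistics import mean, median

def p99(values):
    """99th percentile as the value at 1-based rank ceil(0.99 * n) in sorted order."""
    ordered = sorted(values)
    rank = math.ceil(99 * len(ordered) / 100)
    return ordered[rank - 1]

# Hypothetical sample: 985 fast calls, 14 slow calls, one extreme outlier (in ms)
latencies_ms = [10] * 985 + [300] * 14 + [5000]

summary = {
    "mean": mean(latencies_ms),
    "median": median(latencies_ms),
    "p99": p99(latencies_ms),
    "max": max(latencies_ms),
}
print(summary)
```

In this sample, the mean and median stay close to the fast calls, the maximum is dictated by a single request, and the P99 reports the slow tail that about $1\%$ of calls actually experience.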

4. Percentile Calculation

Let’s say we have a dataset $d$ of $n$ latency records.

To calculate the P99 latency, we first sort the dataset in non-decreasing order. Let the rank of a latency value be its index in the sorted array, counting from 1. The rank of the $p^{th}$ percentile is then:

$Rank_{p} \; = \; ceil(\frac{p}{100} * n)$

Now, we get the $p^{th}$ percentile as:

$Percentile_{p} \; = \; d[Rank_{p}]$
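The two formulas translate into a few lines of code. The following is a minimal sketch, not a reference implementation: the function name `nearest_rank_percentile` and the sample latencies are our own assumptions. Since the rank in the formula counts from 1 and Python lists count from 0, we subtract 1 when indexing:

```python
import math

def nearest_rank_percentile(values, p):
    """Return the p-th percentile (0 < p <= 100) using Rank_p = ceil(p / 100 * n)."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in the range (0, 100]")
    ordered = sorted(values)                   # non-decreasing order
    rank = math.ceil(p * len(ordered) / 100)   # 1-based rank from the formula
    return ordered[rank - 1]                   # convert to a 0-based index

# Hypothetical latency records in milliseconds
latencies_ms = [12, 15, 11, 240, 14, 13, 18, 16, 17, 95]
print(nearest_rank_percentile(latencies_ms, 90))  # 95
print(nearest_rank_percentile(latencies_ms, 99))  # 240
```

We multiply before dividing by 100 so that integer percentiles such as 90 or 99 don't pick up floating-point rounding errors right before `ceil` is applied.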

4.1. Example

Let’s apply these formulae. Let’s say we have the quiz marks of 15 students in a class:

To compute their $90^{th}$ percentile, we first sort them:

Next, we find the percentile rank of P90:

$Rank_{90} \; = \; ceil(\frac{90}{100} * 15) \; = \; ceil(13.5) \; = 14$

Finally, we get the $90^{th}$ percentile:

$Percentile_{90} \; = \; d[Rank_{90}] \; = \; d[14] \;=\; 98$

So, for this dataset, we can say that $90\%$ of students got marks less than or equal to 98.

5. Percentiles and Confidence Intervals

If we take a sample of $n$ elements from a dataset having $N$ elements ( $N$ is much larger than $n$ ), the sample’s $99^{th}$ percentile may differ from that of the entire dataset. Here, the larger the sample, the more precise our estimate will be.

This is similar to the case with latency. When dealing with it, we record latencies and assume they follow the same but unknown distribution. Then, we treat the records as a sample. To account for possible fluctuations of the sample estimate around the actual P99 when generalizing the results to all future latencies, we can construct confidence intervals (CI).

A CI of P99 is a range of values that contains the distributional P99 with predefined confidence. For instance, if $[a, b]$ is the CI with the confidence of 80%, that means that 80% of the CIs we construct in the same way as $[a, b]$ (but by using other samples) will contain the distributional P99.

We can use order statistics to construct a CI for P99. In order statistics, we arrange all the sample values in ascending order and then do our analysis. We employ order statistics in applications such as process simulation, network modeling, actuarial products, and optimizing production processes.

5.1. Order Statistics

Let our sorted set of latencies be $d_1 \leq d_2 \leq \ldots \leq d_n$ , and let $d_r$ be the set’s $99^{th}$ percentile. Let $q$ be the desired level of confidence. The number of recorded latencies that don’t exceed the distributional P99 follows the binomial distribution with parameters $n$ and 0.99, and P99 lies between $d_i$ and $d_j$ exactly when that number is at least $i$ and at most $j - 1$ . So, we find the ranks $i < r$ and $j > r$ such that:

$Probability(d_i \leq P99 < d_j) = \sum_{k=i}^{j-1} \binom{n}{k} 0.99^{k} \times 0.01^{n-k} \geq q$

Then, we can say that the actual P99 latency is between $d_i$ and $d_j$ with the confidence of $q$ . With a large $n$ , we can use the normal approximation to calculate the CI more easily. Moreover, the more latencies we record, the narrower the CI gets. So, if we have a lot of records, we can skip this step and report the sample estimate instead of the interval.
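As an illustration, the search for the ranks $i$ and $j$ can be sketched as follows. This is one possible approach under our own assumptions, not the only way to build such an interval. The function names are hypothetical, the search greedily widens the interval on whichever side adds more binomial probability, and the coverage is summed over the counts from $i$ to $j - 1$ , which is the event that P99 falls between $d_i$ and $d_j$ . The direct use of `math.comb` also assumes a moderate $n$ (up to roughly a thousand records):

```python
from math import ceil, comb

def binom_pmf(n, k, prob):
    """P(K = k) for K ~ Binomial(n, prob)."""
    return comb(n, k) * prob**k * (1 - prob) ** (n - k)

def percentile_ci_ranks(n, percent=99, q=0.95):
    """Find 1-based ranks (i, j), i < r < j, whose order statistics bracket the
    distributional percentile with probability >= q.
    Returns (i, j, coverage), or None if the sample is too small."""
    prob = percent / 100
    r = ceil(percent * n / 100)        # rank of the sample percentile
    i, j = r - 1, r + 1
    if i < 1 or j > n:
        return None
    # Coverage is the sum of P(K = k) for k = i, ..., j - 1
    coverage = binom_pmf(n, i, prob) + binom_pmf(n, r, prob)
    while coverage < q:
        gain_low = binom_pmf(n, i - 1, prob) if i > 1 else -1.0
        gain_high = binom_pmf(n, j, prob) if j < n else -1.0
        if gain_low < 0 and gain_high < 0:
            return None                # cannot widen the interval any further
        if gain_low >= gain_high:
            i -= 1
            coverage += gain_low
        else:
            coverage += gain_high
            j += 1
    return i, j, coverage

print(percentile_ci_ranks(1000, percent=99, q=0.95))
```

With the returned ranks, the interval is $[d_i, d_j]$ in the sorted sample. For small samples, such as 50 records, no rank above $r$ exists, so the sketch returns `None` instead of an interval.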

6. Conclusion

In this article, we have gone through percentiles, latency, and their relationship. Percentiles give us a more realistic and intuitive understanding of the actual performance characteristics of our network or application.

We use the $\mathbf{99^{th}}$ percentile to monitor and improve the overall network latency or our application response time. Percentiles help us distinguish between outliers and real effects.
