1. Introduction

Standard deviation is a common statistical measure that tells us how much the values in a dataset vary, or disperse, from their mean.

There are several ways we can calculate the standard deviation of a series of numbers represented as arrays in Kotlin. In this tutorial, we’ll investigate a few of these approaches.

2. Definition and Equation of Standard Deviation

As mentioned earlier, standard deviation measures the amount of variation or dispersion in a dataset. We calculate it by taking the square root of the variance, where the variance is the average of the squared differences between each data point and the mean of the dataset.

Mathematically, the equation for variance is:

variance = Σ(x - μ)² / N

Here, Σ denotes summation over all data points, x is an individual data point, μ is the mean of the data, and N is the number of data points. Because we divide by N, this is the population variance.

Therefore, knowing how to calculate the variance, we obtain the standard deviation mathematically by taking the square root of the variance:

standard deviation = √variance
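
The two formulas can be translated almost literally into code. The following is a minimal sketch using plain loops and no collection helpers; the function name populationStandardDeviation is hypothetical and chosen only for illustration:

```kotlin
import kotlin.math.sqrt

// Hypothetical helper: a literal, step-by-step translation of the formulas.
fun populationStandardDeviation(data: DoubleArray): Double {
    // μ: the arithmetic mean of the data
    var sum = 0.0
    for (x in data) sum += x
    val mean = sum / data.size

    // Σ(x - μ)²: the sum of squared differences from the mean
    var squaredDiffSum = 0.0
    for (x in data) {
        val diff = x - mean
        squaredDiffSum += diff * diff
    }

    // variance = Σ(x - μ)² / N, standard deviation = √variance
    val variance = squaredDiffSum / data.size
    return sqrt(variance)
}
```

For the values 1.0 through 6.0, the mean is 3.5, the sum of squared differences is 17.5, the variance is 17.5 / 6 ≈ 2.9167, and the function returns approximately 1.7078.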

3. Using Kotlin’s Standard Library

The first approach relies only on Kotlin’s standard library. The kotlin.math package provides functions for common mathematical operations. It doesn’t include a ready-made standard deviation function, so we’ll build one from simpler pieces.

Specifically, we can use sqrt() to compute square roots and pow() to handle exponents, both from kotlin.math, along with the average() extension function on arrays and collections to compute the mean:

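import kotlin.math.pow
import kotlin.math.sqrt

fun standardDeviationUsingMathPackage(numbers: DoubleArray): Double {
    val mean = numbers.average()
    // Population variance: the mean of the squared differences from the mean
    val variance = numbers.map { (it - mean).pow(2) }.average()
    return sqrt(variance)
}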

In the function above, we first calculate the mean of the dataset using average(). Next, we subtract the mean from each number, square the result, and take the average of these squared differences to get the variance. Finally, we return the square root of the variance, which is the standard deviation.
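
One edge case worth keeping in mind is an empty array: average() returns NaN for an empty input, so the computed standard deviation is NaN as well. The following is a minimal sketch of a guarded variant, assuming we’d rather fail fast than return NaN; the name standardDeviationOrThrow is hypothetical:

```kotlin
import kotlin.math.pow
import kotlin.math.sqrt

// Hypothetical variant: rejects empty input instead of silently returning NaN.
fun standardDeviationOrThrow(numbers: DoubleArray): Double {
    require(numbers.isNotEmpty()) { "Cannot compute the standard deviation of an empty dataset" }
    val mean = numbers.average()
    // sumOf() accumulates directly, avoiding the intermediate list that map() creates
    val variance = numbers.sumOf { (it - mean).pow(2) } / numbers.size
    return sqrt(variance)
}
```

Whether to throw, return NaN, or return null for empty input is a design choice that depends on how the caller wants to handle missing data.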

Now, to make sure our method works correctly, it’s a good idea to unit-test it:

@Test
fun `standard deviation using the math package`() {
    val dataset1 = doubleArrayOf(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
    val dataset2 = doubleArrayOf(11.0, 14.0, 19.0, 23.0, 28.0, 30.0)

    assertEquals(1.707825127659933, standardDeviationUsingMathPackage(dataset1))
    assertEquals(6.914156170897181, standardDeviationUsingMathPackage(dataset2))
}
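
Floating-point results can differ in the last few digits depending on the order of operations, so exact equality checks on doubles can be brittle. Here’s a small sketch of a tolerance-based comparison; the helper name approximatelyEquals and the default tolerance of 1e-9 are illustrative assumptions, not part of the tests above:

```kotlin
import kotlin.math.abs

// Hypothetical helper: true when two doubles differ by at most `tolerance`.
fun approximatelyEquals(expected: Double, actual: Double, tolerance: Double = 1e-9): Boolean =
    abs(expected - actual) <= tolerance
```

In a JUnit test, the assertEquals(expected, actual, delta) overload serves the same purpose.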

4. Using the Apache Commons Math Library

Another approach is to use a third-party library that provides ready-made statistical functions. The Apache Commons Math library is a popular choice for this purpose and includes a dedicated class for calculating standard deviation.

4.1. Maven and Gradle Configuration

To use this library in a Maven project, we need to include the dependency in the project’s pom.xml file:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-math3</artifactId>
    <version>3.6.1</version>
</dependency>

For a Gradle project, all we need to do is add the following line in our project’s build.gradle file:

implementation 'org.apache.commons:commons-math3:3.6.1'
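
For projects that use the Gradle Kotlin DSL (that is, a build.gradle.kts file, which is an assumption about the project setup), the equivalent declaration would look roughly like this:

```kotlin
dependencies {
    // Same coordinates as above, written in Kotlin DSL syntax
    implementation("org.apache.commons:commons-math3:3.6.1")
}
```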

4.2. Implementation

Now, let’s use this library to compute the standard deviation of a dataset:

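import org.apache.commons.math3.stat.descriptive.moment.StandardDeviation

fun calculateStandardDeviationUsingApacheCommonsMath(dataset: DoubleArray): Double {
    // false = no bias correction, i.e. the population formula
    val sd = StandardDeviation(false)
    return sd.evaluate(dataset)
}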

In this function, we create a StandardDeviation instance and pass false to its constructor. This flag, named isBiasCorrected in the library, specifies whether the data under consideration is a sample or the entire population.

If set to true, the library treats the dataset as a sample and applies the bias-corrected formula, which divides by n - 1. If set to false, the library treats the dataset as the entire population and divides by n, which matches the formula from the earlier section.

Finally, we use the evaluate() method on the StandardDeviation instance while passing in the dataset as a parameter. This will return the standard deviation of the dataset.
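
To make the difference between the sample and population formulas concrete, here’s a minimal, library-free sketch of both. The function name and the biasCorrected parameter are hypothetical and only mirror the meaning of the constructor flag; this isn’t how the library implements the calculation internally:

```kotlin
import kotlin.math.sqrt

// Hypothetical illustration of the two divisors; not the library's implementation.
fun standardDeviation(data: DoubleArray, biasCorrected: Boolean): Double {
    require(data.size > 1) { "Need at least two data points" }
    val mean = data.average()
    val squaredDiffSum = data.sumOf { (it - mean) * (it - mean) }
    // The sample formula divides by n - 1; the population formula divides by n
    val divisor = if (biasCorrected) data.size - 1 else data.size
    return sqrt(squaredDiffSum / divisor)
}
```

For the values 1.0 through 6.0, the population result is approximately 1.7078, while the sample result is the square root of 17.5 / 5 = 3.5, approximately 1.8708. The gap between the two shrinks as the dataset grows.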

To ensure the correctness of our implementation, we should test our function:

@Test
fun `standard deviation using the apache commons math library`() {
    val dataset1 = doubleArrayOf(1.0, 2.0, 3.0, 4.0, 5.0, 6.0)
    val dataset2 = doubleArrayOf(11.0, 14.0, 19.0, 23.0, 28.0, 30.0)

    assertEquals(1.707825127659933, calculateStandardDeviationUsingApacheCommonsMath(dataset1))
    assertEquals(6.914156170897181, calculateStandardDeviationUsingApacheCommonsMath(dataset2))
}

5. Conclusion

In this article, we’ve explored two approaches to calculating the standard deviation of a dataset. The first uses only functions from Kotlin’s standard library, such as average(), pow(), and sqrt(), so it’s readily available in any Kotlin program.

Conversely, the second approach involves the use of a third-party library, the Apache Commons Math library. Although this library provides us with advanced statistical methods, it requires us first to configure it for our projects before we can use it.

As always, the code samples and relevant test cases related to this article can be found over on GitHub.