1. Introduction

When a number arrives as a String, we often need to convert it to a numeric type before using it in calculations. This gets more complicated when larger values contain thousands separators such as commas (“,”) or periods (“.”), because a plain toInt() call rejects such input with a NumberFormatException.

In this tutorial, we’ll explore several approaches for parsing these strings into numerical values.

2. The DecimalFormatSymbols Class

The DecimalFormatSymbols class plays a central role in our parsing techniques. It gives us locale-specific formatting symbols, such as the thousands separator, which we can read through its groupingSeparator property:

DecimalFormatSymbols.getInstance(Locale.getDefault()).groupingSeparator

We’ll utilize this to retrieve the grouping separator character specific to the provided locale.

Throughout this tutorial, we’ll look at two locales: Locale.US, which separates the thousands position with a comma, and Locale.GERMAN, which separates the thousands position with a period.
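To make the difference between the two locales concrete, here is a minimal sketch of a lookup helper. The name groupingSeparatorOf is hypothetical and only used for illustration; it simply wraps the DecimalFormatSymbols call shown above.

```kotlin
import java.text.DecimalFormatSymbols
import java.util.Locale

// Hypothetical helper (not part of the article's source code):
// returns the character the given locale uses to group thousands.
fun groupingSeparatorOf(locale: Locale): Char =
    DecimalFormatSymbols.getInstance(locale).groupingSeparator
```

On a standard JDK, groupingSeparatorOf(Locale.US) returns ',' and groupingSeparatorOf(Locale.GERMAN) returns '.'.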

3. Using replace() With Regex

A straightforward approach to parsing a string with the thousands separator involves using the replace() method. This method removes the locale-specific thousands separator from the string specified with a regular expression:

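import java.text.DecimalFormatSymbols
import java.util.Locale

fun parseStringUsingReplace(input: String, locale: Locale): Int {
    val separator = DecimalFormatSymbols.getInstance(locale).groupingSeparator

    // Regex.escape() keeps the separator literal even if it is a regex metacharacter
    return input.replace(Regex(Regex.escape(separator.toString())), "").toInt()
}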

Using regex, we replace all instances of the grouping separator with an empty string before converting the number to an Int.

Now, we need to unit test our helper method for correctness:

@Test
fun `parses string with thousands separator using replace method`() {
    val result1 = parseStringUsingReplace("1,000", Locale.US)
    val result2 = parseStringUsingReplace("25.750", Locale.GERMAN)

    assertEquals(1000, result1)
    assertEquals(25750, result2)
}

In this test, we parse numbers formatted with two different locales, one that uses a comma for the thousands separator and another that uses a period.
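Since the separator is a single known character, a regular expression isn’t strictly required. As a sketch of an alternative, the variant below uses the literal String.replace() overload and returns null instead of throwing when the remaining text isn’t a valid integer. The function name parseUsingLiteralReplaceOrNull is an assumption made for this example.

```kotlin
import java.text.DecimalFormatSymbols
import java.util.Locale

// Hypothetical variant for illustration: literal (non-regex) replacement,
// so no escaping of the separator is needed.
fun parseUsingLiteralReplaceOrNull(input: String, locale: Locale): Int? {
    val separator = DecimalFormatSymbols.getInstance(locale).groupingSeparator

    // toIntOrNull() yields null for input that is still not a valid Int
    return input.replace(separator.toString(), "").toIntOrNull()
}
```

With this variant, "1,000" with Locale.US gives 1000, while input such as "abc" gives null rather than a NumberFormatException.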

4. Using the StringTokenizer Class

Another strategy is to use the StringTokenizer. We can split the string into tokens based on a specified delimiter. In our case, we’re going to use the Locale-specific groupingSeparator again:

import java.text.DecimalFormatSymbols
import java.util.Locale
import java.util.StringTokenizer

fun parseStringUsingTokenizer(input: String, locale: Locale): Int {
    val separator = DecimalFormatSymbols.getInstance(locale).groupingSeparator
    val tokenizer = StringTokenizer(input, separator.toString())
    val builder = StringBuilder()
    // Join the digit groups back together without the separator
    while (tokenizer.hasMoreTokens()) {
        builder.append(tokenizer.nextToken())
    }
    return builder.toString().toInt()
}

First, we create an instance of StringTokenizer with the separator character as the delimiter. Then, we loop through each token in the string and append it to a StringBuilder. Finally, we convert the joined digits to an Int.

Again, we test this by parsing numbers from two different Locales:

@Test
fun `parses string with thousands separator using string tokenizer`() {
    val result1 = parseStringUsingTokenizer("1,000", Locale.US)
    val result2 = parseStringUsingTokenizer("25.750", Locale.GERMAN)

    assertEquals(1000, result1)
    assertEquals(25750, result2)
}
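The StringTokenizer Javadoc describes it as a legacy class and recommends split() for new code. As a rough sketch of the same tokenizing idea with Kotlin’s standard library, we could split on the separator and join the pieces. The name parseStringUsingSplit is hypothetical and chosen for this example only.

```kotlin
import java.text.DecimalFormatSymbols
import java.util.Locale

// Hypothetical alternative for illustration: same idea as the tokenizer
// approach, but using Kotlin's split() and joinToString().
fun parseStringUsingSplit(input: String, locale: Locale): Int {
    val separator = DecimalFormatSymbols.getInstance(locale).groupingSeparator

    // Split into digit groups, then concatenate them without a delimiter
    return input.split(separator).joinToString("").toInt()
}
```

It behaves like the tokenizer version for the inputs used in this tutorial: "1,000" with Locale.US yields 1000, and "25.750" with Locale.GERMAN yields 25750.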

5. Using NumberFormat

Finally, we can use the NumberFormat class from the Java standard library. Its parse() method reads numbers according to the conventions of a given Locale:

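import java.text.NumberFormat
import java.util.Locale

fun parseStringWithSeparatorUsingNumberFormat(input: String, locale: Locale): Int {
    // The locale-aware format knows which grouping separator to expect
    val format = NumberFormat.getInstance(locale)
    // parse() returns a Number, which we narrow to Int
    val parsed = format.parse(input)
    return parsed.toInt()
}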

As usual, let’s test our helper method for correctness:

@Test
fun `parses string with thousands separator using number format class`() {
    val result1 = parseStringWithSeparatorUsingNumberFormat("1,000", Locale.US)
    val result2 = parseStringWithSeparatorUsingNumberFormat("25.750", Locale.GERMAN)

    assertEquals(1000, result1)
    assertEquals(25750, result2)
}

Because we pass the Locale to NumberFormat, it applies the correct thousands separator for us, so we never have to look up the separator ourselves.
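One caveat is that NumberFormat.parse(String) is lenient: it may stop at the first character it can’t interpret and return whatever it has parsed so far. Likewise, parsing "25.750" with Locale.US treats the period as a decimal point, so the result is 25.75, which toInt() truncates to 25. A stricter variant can be sketched with ParsePosition, assuming we want to reject any input that isn’t consumed completely. The function name parseStrictOrNull is hypothetical.

```kotlin
import java.text.NumberFormat
import java.text.ParsePosition
import java.util.Locale

// Hypothetical strict variant for illustration: returns null unless the
// whole input is a valid integer for the given locale.
fun parseStrictOrNull(input: String, locale: Locale): Int? {
    // An integer-only format stops at a decimal separator instead of consuming it
    val format = NumberFormat.getIntegerInstance(locale)
    val position = ParsePosition(0)

    // parse() returns null when nothing could be parsed at all
    val parsed = format.parse(input, position) ?: return null

    // Reject input with trailing characters the format did not consume
    if (position.index != input.length) return null

    // Reject values that do not fit into an Int
    val value = parsed.toLong()
    if (value < Int.MIN_VALUE || value > Int.MAX_VALUE) return null
    return value.toInt()
}
```

With this sketch, "1,000" with Locale.US still yields 1000, while "1,000abc" with Locale.US and "25.750" with Locale.US both yield null instead of a silently wrong value.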

6. Conclusion

In this article, we’ve explored various approaches to parsing strings with a thousands-position separator in Kotlin.

First, we discovered how to determine the locale-specific grouping separator. Then, we explored the replace() method and regex. Additionally, we examined the StringTokenizer class to split the string into tokens based on specified delimiters. Finally, we delved into the NumberFormat class from the Java standard library.

As always, the complete source code used in this article is available over on GitHub.
