1. Overview

In this quick tutorial, we’re going to get familiar with a couple of ways to split Strings into a sequence of elements in Kotlin.

2. Split by Delimiters

To split a String around a single delimiter, we can pass that delimiter to the split() extension function:

val info = "Name,Year,Location"
assertThat(info.split(",")).containsExactly("Name", "Year", "Location")

Here, the split() function splits the given String around the comma separator, as expected. The same function also accepts multiple delimiters:

val info = "Name,Year,Location/Time"
assertThat(info.split(",", "/")).containsExactly("Name", "Year", "Location", "Time")

In the above example, we’re using the comma and slash characters as delimiters at the same time. It’s also possible to limit the number of split parts:

val info = "Name,Year,Location/Time/Date"
assertThat(info.split(",", limit = 2)).containsExactly("Name", "Year,Location/Time/Date")
assertThat(info.split(",", "/", limit = 4)).containsExactly("Name", "Year", "Location", "Time/Date")

When the limit argument is, say, 2, the returned list has at most 2 elements. Once the limit is reached, the rest of the input stays unsplit in the last element.
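To make the edge cases of limit concrete, here is a minimal sketch. The splitOnComma() helper is a hypothetical name introduced only for this illustration. It relies on the default limit of 0 meaning "no limit" and on the standard library rejecting negative limits:

```kotlin
// Hypothetical helper for illustration: splits on commas with the given limit.
fun splitOnComma(input: String, limit: Int): List<String> =
    input.split(",", limit = limit)

fun main() {
    val info = "Name,Year,Location/Time/Date"

    // limit = 0 is the default and means "no limit"
    println(splitOnComma(info, 0)) // [Name, Year, Location/Time/Date]

    // with limit = 2, everything after the first comma stays in the last element
    println(splitOnComma(info, 2)) // [Name, Year,Location/Time/Date]

    // a negative limit is rejected with an IllegalArgumentException
    val failure = runCatching { splitOnComma(info, -1) }
    println(failure.exceptionOrNull() is IllegalArgumentException) // true
}
```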

Moreover, it’s possible to split around a delimiter in a case-insensitive manner:

val info = "127.0.0.1aaFirefoxAA58"
assertThat(info.split("aa", ignoreCase = true)).containsExactly("127.0.0.1", "Firefox", "58")
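The delimiters don't have to be Strings. As a small sketch, the standard library also has a split() overload that takes Char delimiters and accepts the same ignoreCase and limit arguments. The splitOnChars() helper and the sample inputs are made up for this illustration:

```kotlin
// Hypothetical helper for illustration: splits around ',' or ';' given as Chars.
fun splitOnChars(input: String): List<String> = input.split(',', ';')

fun main() {
    println(splitOnChars("Name,Year;Location")) // [Name, Year, Location]

    // ignoreCase works with Char delimiters as well
    println("1x2X3".split('x', ignoreCase = true)) // [1, 2, 3]
}
```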

In addition to the split() function, we can use the lines() extension function to split a String around line breaks:

val info = "First line\nsecond line\rthird"
assertThat(info.lines()).containsExactly("First line", "second line", "third")

The lines() function uses three characters or character sequences as its delimiters: \n, \r, and \r\n.
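As a quick sketch of two details that follow from these delimiters: a Windows-style \r\n counts as one line break rather than two, and a trailing line break leaves an empty String at the end of the result. The nonEmptyLines() helper is a hypothetical name used here to show one way to drop such empty entries:

```kotlin
// Hypothetical helper for illustration: splits into lines and drops empty ones.
fun nonEmptyLines(text: String): List<String> =
    text.lines().filter { it.isNotEmpty() }

fun main() {
    // "\r\n" is treated as a single delimiter
    println("first\r\nsecond\r\nthird".lines()) // [first, second, third]

    // a trailing line break produces a trailing empty String
    println("first\nsecond\n".lines().size) // 3

    println(nonEmptyLines("first\nsecond\n")) // [first, second]
}
```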

2.1. Lazy Split

All variations of the split() extension function return a List<String>, an eagerly evaluated collection of parts. In contrast, the splitToSequence() variation returns a Sequence<String>, which is evaluated lazily:

val info = "random_text,".repeat(1000)
assertThat(info.splitToSequence(",").first()).isEqualTo("random_text")

Here, splitToSequence() won’t create a list holding all the split parts. Instead, it returns a Sequence whose iterator computes each part only when we ask for it. In the above example, only the first split part is computed.

Just like with other sequences, if we perform a chain of operations on the returned Sequence<String>, no intermediate collection is created at the end of each step. This can be beneficial, especially when the number of split parts and operations is high.
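The following sketch shows such a chain. The firstParts() helper is a hypothetical name for this illustration, and uppercase() assumes Kotlin 1.5 or later. Because take() stops the pipeline early, only as many parts as requested are pulled from the sequence:

```kotlin
// Hypothetical helper for illustration: returns the first `count` non-blank parts, upper-cased.
fun firstParts(input: String, count: Int): List<String> =
    input.splitToSequence(",")
        .filter { it.isNotBlank() }  // evaluated lazily, one part at a time
        .map { it.uppercase() }      // no intermediate list is built here
        .take(count)                 // stops pulling parts once `count` is reached
        .toList()                    // terminal operation that triggers the work

fun main() {
    val info = "random_text,".repeat(1000)
    println(firstParts(info, 3)) // [RANDOM_TEXT, RANDOM_TEXT, RANDOM_TEXT]
}
```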

3. Split by Regex

In addition to literal characters, we can also use regular expressions as the delimiter. For instance:

val info = "28 + 32 * 2 / 64 = 29"
val regex = "\\D+".toRegex()
assertThat(info.split(regex)).containsExactly("28", "32", "2", "64", "29")

In the above example, we’re using any sequence of non-digits as the delimiter. Similarly, on the JVM, we can pass a java.util.regex.Pattern instead of a Regex:

val pattern = Pattern.compile("\\D+")
assertThat(info.split(pattern)).containsExactly("28", "32", "2", "64", "29")

It’s also possible to limit the number of parts:

assertThat(info.split(regex, 3)).containsExactly("28", "32", "2 / 64 = 29")
assertThat(info.split(pattern, 3)).containsExactly("28", "32", "2 / 64 = 29")
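Regex delimiters can be combined with lazy splitting, too. This sketch assumes Kotlin 1.6 or later, where splitToSequence() also accepts a Regex. The firstNumbers() helper is a hypothetical name introduced for this illustration:

```kotlin
// Hypothetical helper for illustration: lazily extracts the first `count` numbers.
fun firstNumbers(expression: String, count: Int): List<Int> =
    expression.splitToSequence(Regex("\\D+"))
        .filter { it.isNotEmpty() }  // guards against input that starts or ends with non-digits
        .map { it.toInt() }
        .take(count)
        .toList()

fun main() {
    println(firstNumbers("28 + 32 * 2 / 64 = 29", 2)) // [28, 32]
}
```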

A common mistake for those transitioning from the Java world to Kotlin is passing a regex pattern as a plain String to the split() function. For instance, consider the scenario where we need to split an input string by consecutive whitespace characters:

val info = "a b    c      d"

In Java, this line does the job:

// Java code:
info.split("\\s+"); // result in a string array: "a", "b", "c", "d"

In Kotlin, however, split() treats a String argument as a literal delimiter. So, unlike in Java, we must pass a Regex object to split by a regular expression:

// splitting by a regex written as a plain String won't work in Kotlin:
assertThat(info.split("\\s+")).containsExactly(info)

// a Regex object is required
assertThat(info.split(Regex("\\s+"))).containsExactly("a", "b", "c", "d")
assertThat(info.split("\\s+".toRegex())).containsExactly("a", "b", "c", "d")
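The flip side of this behavior is that characters with a special meaning in regular expressions need no escaping when passed as String delimiters. A brief sketch, with sample values made up for this illustration:

```kotlin
// Hypothetical helper for illustration: splits a dotted version string into its parts.
fun versionParts(version: String): List<String> = version.split(".")

fun main() {
    // "." is taken literally by the String overload
    println(versionParts("1.2.3")) // [1, 2, 3]

    // the Regex overload needs the dot escaped to get the same result
    println("1.2.3".split(Regex("\\."))) // [1, 2, 3]

    // the same holds for other metacharacters, such as "|"
    println("a|b|c".split("|")) // [a, b, c]
}
```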

4. Split Into an Array

The split() function is declared to return a List<String>. In this example, the object we get back at runtime is an ArrayList:

val fruits = "apple,banana,grapes,orange"
val fruitsArrayList = fruits.split(",")
assertEquals("ArrayList", fruitsArrayList::class.simpleName)

So, if we want the split values in an Array, we need to convert the list using the toTypedArray() function:

val fruitsArray = fruitsArrayList.toTypedArray()
assertEquals("Array", fruitsArray::class.simpleName)

As expected, fruitsArray is an Array. Furthermore, we can validate its contents:

assertEquals(4, fruitsArray.size)
assertEquals("apple", fruitsArray[0])
assertEquals("banana", fruitsArray[1])
assertEquals("grapes", fruitsArray[2])
assertEquals("orange", fruitsArray[3])

It looks like we’ve got this one right.

Alternatively, we could use the Array() constructor to build the array from the values in fruitsArrayList:

val fruitsArray = Array(fruitsArrayList.size) { i -> fruitsArrayList[i] }
assertEquals(4, fruitsArray.size)
assertEquals("apple", fruitsArray[0])
assertEquals("banana", fruitsArray[1])
assertEquals("grapes", fruitsArray[2])
assertEquals("orange", fruitsArray[3])

Perfect! The result meets our expectations.
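Both steps can also be chained into a single expression. In this sketch, splitToArray() is a hypothetical helper name. Since == on arrays compares references rather than elements, it uses contentEquals() to compare the contents:

```kotlin
// Hypothetical helper for illustration: splits and converts to an Array in one step.
fun splitToArray(input: String, delimiter: String): Array<String> =
    input.split(delimiter).toTypedArray()

fun main() {
    val fruitsArray = splitToArray("apple,banana,grapes,orange", ",")
    println(fruitsArray.contentToString()) // [apple, banana, grapes, orange]

    // compares element by element instead of by reference
    val expected = arrayOf("apple", "banana", "grapes", "orange")
    println(fruitsArray.contentEquals(expected)) // true
}
```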

5. Conclusion

In this short tutorial, we saw how we could split a Kotlin String using literal characters and regular expressions as delimiters. Moreover, we also explored the use cases of splitting a string into an array and lazily computing the split parts.

As usual, all the examples are available over on GitHub.
