1. Overview

In this quick tutorial, we’re going to get familiar with a couple of ways to split Strings into a sequence of elements in Kotlin.

2. Split by Delimiters

To split a String with just one delimiter, we can use the split(delimiter: String) extension function:

val info = "Name,Year,Location"
assertThat(info.split(",")).containsExactly("Name", "Year", "Location")

Here, the split() function splits the given String around the comma separator as expected. We can also pass multiple delimiters to the same function:

val info = "Name,Year,Location/Time"
assertThat(info.split(",", "/")).containsExactly("Name", "Year", "Location", "Time")

In the above example, we’re using the comma and slash characters as delimiters at the same time. It’s also possible to limit the number of split parts:

val info = "Name,Year,Location/Time/Date"
assertThat(info.split(",", limit = 2)).containsExactly("Name", "Year,Location/Time/Date")
assertThat(info.split(",", "/", limit = 4)).containsExactly("Name", "Year", "Location", "Time/Date")

When the limit argument is, say, 2, the returned list has at most 2 elements. Once the limit is reached, the rest of the String is left unsplit in the last element.
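As a minimal sketch of where limit is handy, consider parsing a key=value line whose value may itself contain the delimiter. The parseEntry() helper below is hypothetical and only serves as an illustration:

```kotlin
// Hypothetical helper: parses a "key=value" line, keeping any further '=' inside the value.
fun parseEntry(line: String): Pair<String, String> {
    // limit = 2 stops splitting after the first delimiter
    val parts = line.split("=", limit = 2)
    val key = parts[0]
    val value = parts.getOrElse(1) { "" } // no '=' at all: treat the value as empty
    return key to value
}

fun main() {
    println(parseEntry("url=https://example.com?a=1")) // (url, https://example.com?a=1)
    println(parseEntry("flag"))                         // (flag, )
}
```

The default limit of 0 means "no limit", and a negative limit is rejected with an IllegalArgumentException.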

Moreover, it’s possible to split around a delimiter in a case-insensitive manner:

val info = "aaFirefoxAA58"
assertThat(info.split("aa", ignoreCase = true)).containsExactly("", "Firefox", "58")
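Note that the expected result above starts with an empty String: split() keeps empty parts, including leading and trailing ones. If we don't want them, one option is to filter them out afterwards. The splitNonEmpty() helper below is a hypothetical name used for this sketch:

```kotlin
// Hypothetical helper: splits on a delimiter and drops the empty parts.
fun splitNonEmpty(text: String, delimiter: String, ignoreCase: Boolean = false): List<String> =
    text.split(delimiter, ignoreCase = ignoreCase).filter { it.isNotEmpty() }

fun main() {
    println(",a,,b,".split(","))                               // [, a, , b, ]
    println(splitNonEmpty(",a,,b,", ","))                      // [a, b]
    println(splitNonEmpty("xAAyaaz", "aa", ignoreCase = true)) // [x, y, z]
}
```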

In addition to the split() function, we can use the lines() extension function to split a String on line breaks:

val info = "First line\nsecond line\rthird"
assertThat(info.lines()).containsExactly("First line", "second line", "third")

The lines() function uses three characters or character sequences as its delimiters: \n, \r, and \r\n.
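To illustrate, here's a small sketch with mixed line separators. It shows that \r\n counts as a single delimiter and that a trailing line break yields a final empty String. The nonBlankLines() helper is hypothetical:

```kotlin
// Hypothetical helper: returns the non-blank lines of a text, whatever line separator it uses.
fun nonBlankLines(text: String): List<String> =
    text.lines().filter { it.isNotBlank() }

fun main() {
    val mixed = "first\r\nsecond\nthird\r"
    println(mixed.lines())        // [first, second, third, ]
    println(nonBlankLines(mixed)) // [first, second, third]
}
```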

2.1. Lazy Split

All variations of the split() extension function return a List<String>, an eagerly evaluated collection of parts. In contrast, the splitToSequence() variation returns a Sequence<String>, which is lazily evaluated:

val info = "random_text,".repeat(1000)
assertThat(info.splitToSequence(",").first()).isEqualTo("random_text")

Here, splitToSequence() won’t build a collection holding all the parts (1,001 of them, since the trailing comma yields a final empty String). Instead, it returns a Sequence whose iterator computes each part only when we ask for it. In the above example, only the first split part is computed.

Just like other sequences, if we chain operations on the returned Sequence<String>, no intermediate collection is created at the end of each step. This can be beneficial, especially when the number of split parts and operations is high.
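To make the difference visible, the following sketch counts how many times a mapping step runs when we only need the first few parts. The lazyMapCalls() and eagerMapCalls() functions are hypothetical helpers written for this comparison:

```kotlin
// Counts how often the mapping step runs when only the first n parts are needed (lazy version).
fun lazyMapCalls(text: String, n: Int): Int {
    var calls = 0
    text.splitToSequence(",")
        .map { calls++; it.length }
        .take(n)
        .toList() // terminal operation: pulls only n parts through the chain
    return calls
}

// Same chain on the eager List<String>: every part is split and mapped before take(n).
fun eagerMapCalls(text: String, n: Int): Int {
    var calls = 0
    text.split(",")
        .map { calls++; it.length }
        .take(n)
    return calls
}

fun main() {
    val info = "random_text,".repeat(1000)
    println(lazyMapCalls(info, 3))  // 3
    println(eagerMapCalls(info, 3)) // 1001
}
```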

3. Split by Regex

In addition to literal characters, we can also use regular expressions as the delimiter. For instance:

val info = "28 + 32 * 2 / 64 = 29"
val regex = "\\D+".toRegex()
assertThat(info.split(regex)).containsExactly("28", "32", "2", "64", "29")

In the above example, we’re using any sequence of non-digits as the delimiter. Similarly, we can pass a Java Pattern (java.util.regex.Pattern) instead of a Kotlin Regex:

val pattern = Pattern.compile("\\D+")
assertThat(info.split(pattern)).containsExactly("28", "32", "2", "64", "29")

It’s also possible to limit the number of parts:

assertThat(info.split(regex, 3)).containsExactly("28", "32", "2 / 64 = 29")
assertThat(info.split(pattern, 3)).containsExactly("28", "32", "2 / 64 = 29")
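As another sketch, a regex delimiter is convenient when the input mixes several separators. The tokens() helper and the separators pattern below are hypothetical and only illustrate the idea:

```kotlin
// Assumed pattern for this sketch: any run of commas, semicolons, or whitespace.
val separators = Regex("[,;\\s]+")

// Hypothetical helper: splits on the pattern and drops empty parts.
fun tokens(text: String): List<String> =
    text.split(separators).filter { it.isNotEmpty() }

fun main() {
    println(tokens("kotlin, java;scala   go"))                 // [kotlin, java, scala, go]
    println("kotlin, java;scala".split(separators, limit = 2)) // [kotlin, java;scala]
}
```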

4. Conclusion

In this short tutorial, we saw how we could split a Kotlin String using literal characters and regular expressions as delimiters. Moreover, we also learned to compute split parts in a lazy fashion.

As usual, all the examples are available over on GitHub.
