1. Overview

In this tutorial, we’ll learn how to remove duplicate characters from a String in Scala using the standard library.

2. Iterating Through the String

The most naive approach is to iterate through the characters of the String and, for each one, check whether we have already seen it:

scala> val s = "abcb"
s: String = abcb

scala> val sb = new StringBuilder()
sb: StringBuilder =

scala> s.foreach { char =>
     |   if (!sb.contains(char)) {
     |     sb.append(char)
     |   }
     | }

scala> sb.toString
res5: String = abc

In this example, we use a StringBuilder to store the characters we have already seen. Because each check scans the characters collected so far, the running time grows quadratically with the length of the String. We could vary the details while keeping the same idea. For instance, we can use the String.indexOf() method to find out whether a character appears later in the String, as we might do when removing duplicated characters in Java.
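To make the indexOf() idea concrete, here is a minimal sketch. The object and method names (IndexOfDedup, keepFirst, keepLast) are hypothetical and chosen only for illustration. The first method keeps a character at its first position, which matches the StringBuilder example above. The second one looks ahead instead, so it keeps the last occurrence of each character and may produce a different order.

```scala
object IndexOfDedup {
  // Keep a character only where it occurs for the first time:
  // indexOf(c) returns the index of the first occurrence of c.
  def keepFirst(s: String): String =
    s.zipWithIndex.collect { case (c, i) if s.indexOf(c) == i => c }.mkString

  // Look-ahead variant: keep a character only if it does not
  // appear again after position i (indexOf returns -1 in that case).
  def keepLast(s: String): String =
    s.zipWithIndex.collect { case (c, i) if s.indexOf(c, i + 1) == -1 => c }.mkString
}
```

For the input "abcb", IndexOfDedup.keepFirst returns abc, while IndexOfDedup.keepLast returns acb. Both variants still scan the String once per character, so they share the quadratic cost of the naive approach.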

3. Using distinct()

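A simpler approach is the distinct method, which Scala makes available on String through StringOps. It keeps the first occurrence of each character and preserves the original order: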

scala> val s = "abcb"
s: String = abcb

scala> s.distinct
res0: String = abc

Relying on a method that already exists in the standard library takes much less effort than writing the loop ourselves.
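As a quick illustration of the ordering behaviour, the sketch below wraps distinct in a small helper. The names DistinctExample and removeDuplicates are hypothetical; the helper does nothing beyond delegating to the standard method.

```scala
object DistinctExample {
  // Hypothetical helper: delegates to the standard distinct method,
  // which keeps each character at the position of its first occurrence.
  def removeDuplicates(s: String): String = s.distinct
}
```

Calling DistinctExample.removeDuplicates("banana") returns ban, and "mississippi" becomes misp: the result follows the order in which the characters first appear, not alphabetical order.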

4. Using a Set

If we don’t care about the order of the characters, we can convert our String into a Set, which by definition can’t contain duplicates:

scala> val s = "aabbccddeeff"
s: String = aabbccddeeff

scala> s.toSet
res0: scala.collection.immutable.Set[Char] = Set(e, f, a, b, c, d)

scala> s.toSet.mkString
res1: String = efabcd

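But we can also keep the original order if we use a Set that remembers insertion order, such as LinkedHashSet. Note that this is different from a sorted Set: the elements come back in the order they were first added, not in ascending order: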

scala> import scala.collection.mutable.LinkedHashSet
import scala.collection.mutable.LinkedHashSet

scala> val orderedSet = LinkedHashSet[Char]()
orderedSet: scala.collection.mutable.LinkedHashSet[Char] = Set()

scala> (orderedSet ++= s.toList).mkString
res0: String = abcdef
res0: String = abcdef

This approach ensures we keep the original order while removing the duplicates.
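Since the sample input above happens to be in alphabetical order already, the following sketch uses an unordered input to show the difference between an insertion-ordered Set and a sorted one. The object and method names (SetDedup, keepInsertionOrder, sortedDistinct) are hypothetical, and TreeSet is brought in here only for comparison.

```scala
import scala.collection.immutable.TreeSet
import scala.collection.mutable.LinkedHashSet

object SetDedup {
  // Insertion-ordered: characters come out in the order they first appear.
  def keepInsertionOrder(s: String): String =
    LinkedHashSet(s.toSeq: _*).mkString

  // Sorted: characters come out in ascending order, whatever the input order.
  def sortedDistinct(s: String): String =
    TreeSet(s.toSeq: _*).mkString
}
```

For the input "cabbac", SetDedup.keepInsertionOrder returns cab, whereas SetDedup.sortedDistinct returns abc. Only the LinkedHashSet version preserves the original order of the characters.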

5. Conclusion

In this article, we’ve learned how to easily remove the duplicated characters of a Scala String by using the standard library.

We started with the naive approach of iterating through each character and checking whether we had already seen it. Then, we used the distinct method, and finally, we saw how to use a Set to achieve the same result, with or without preserving the original order.
