Baeldung Pro – Scala – NPI EA (cat = Baeldung on Scala)
announcement - icon

Learn through the super-clean Baeldung Pro experience:

>> Membership and Baeldung Pro.

No ads, dark-mode and 6 months free of IntelliJ Idea Ultimate to start with.

1. Introduction

String manipulation is a common task in programming, and one of the recurring challenges is dealing with inconsistent whitespace.

In this tutorial, we’ll look at various methods to convert multiple spaces into a single space in Scala. Additionally, we’ll remove leading and trailing spaces from the string while collapsing consecutive spaces.

2. Using replaceAll()

The most straightforward method is to use a regular expression with the replaceAll() function to handle this task. Let’s look at the implementation:

def usingReplaceAll(str: String): String = {
  str.trim.replaceAll("\\s+", " ")
}

Here, we used the regular expression \\s+ to match multiple spaces and replace them with a single space character. We also use the trim() function to remove the leading and trailing spaces.

3. Using split()

Another approach is to split the string into individual words using the same regular expression for multiple spaces and combine them into a single string:

def usingSplit(str: String): String = {
  str.trim.split("\\s+").mkString(" ")
}

Here, we split the string using the regular expression \\s+ and combined the words using the mkString() function.

4. Using zip()

To check for and remove multiple spaces, we can compare adjacent characters to determine if both are spaces. Based on this comparison, we can construct a string without multiple spaces. In Scala, we can achieve this by using the zip() function on the string with its tail and comparing characters pairwise:

def usingZip(str: String): String = {
  if (str.trim.isEmpty) {
    str.trim
  } else {
    val zipped = str.trim.zip(str.trim.tail)
    str.trim.head + zipped.collect {
      case (a, b) if !(a == ' ' && b == ' ') => b
    }.mkString
  }
}

While this method works, it may not be the most performant for large strings due to the additional overhead of creating pairs and performing comparisons.

5. Using StringBuilder

Another effective approach for collapsing multiple spaces into one is using StringBuilder. This method involves iterating through the string and constructing the result efficiently with StringBuilder. Let’s look at the implementation:

def usingStringBuilder(str: String): String = {
    val sb = new StringBuilder
    var lastCharWasSpace = false
    for (c <- str.trim) {
      if (c != ' ' || !lastCharWasSpace) sb.append(c)
      lastCharWasSpace = c == ' '
    }
    sb.toString
}

By maintaining a flag to track whether the previous character was a space, we can ensure that only the first space in any sequence of consecutive spaces is added to the StringBuilder. This approach minimizes memory overhead and avoids the creation of intermediate strings, making it particularly suitable for handling large strings or performance-critical applications.

The downside of this approach is that it is generally not recommended in functional programming, as it relies on mutable data structures like StringBuilder and a var to track state rather than embracing Scala’s functional programming idioms. Nonetheless, this method can still be useful in cases where performance optimizations or simplicity are prioritized over strict adherence to functional paradigms.

6. Conclusion

In this article, we explored different methods for collapsing multiple spaces into a single space in Scala. The replaceAll() method is simple and readable but may be less efficient for large strings due to regex overhead. The split() method, which separates the string into words and joins them with a single space, offers a functional approach but can involve extra processing steps. The zip() method provides a functional solution by comparing adjacent characters, though it may not be optimal for large datasets. The StringBuilder method is highly efficient for performance-critical scenarios, as it minimizes memory usage and avoids intermediate strings. Depending on the situation, we can choose the best implementation.

The code backing this article is available on GitHub. Once you're logged in as a Baeldung Pro Member, start learning and coding on the project.