1. Overview

When working with a string collection, concatenating these strings with specific separators is a common task. Fortunately, various solutions are at our disposal, including using String.join() and Collectors.joining().
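As a quick refresher, the two standard approaches mentioned above can be sketched as follows. The class and method names here are illustrative only and aren't part of the original examples:

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch
class SeparatorJoinSketch {

    // String.join() places the separator between elements only
    static String joinWithStringJoin(List<String> items) {
        return String.join(", ", items);
    }

    // The same result using the Stream API and Collectors.joining()
    static String joinWithCollector(List<String> items) {
        return items.stream().collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        List<String> items = List.of("A", "B", "C", "D");
        System.out.println(joinWithStringJoin(items)); // A, B, C, D
        System.out.println(joinWithCollector(items));  // A, B, C, D
    }
}
```

Both calls apply a single separator uniformly, which is why neither can produce a different connector before the last element on its own.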

In this quick tutorial, we’ll explore an interesting string concatenation problem: joining strings in a more natural language-like manner.

2. Introduction to the Problem

Let’s understand the problem with an example. Say we have a list of strings {“A”, “B”, “C”, “D”}. If we join them with a comma followed by a space as the separator, the result is “A, B, C, D”. So far, so good.

However, if we want the joined result to follow English grammar, the expected outcome should be “A, B, C and D” or “A, B, C, and D”. We’ll see why there are two variants shortly. Either way, the result isn’t something we can obtain directly from a single String.join() or Collectors.joining() call.

The comma between “C” and “and” in the second variant is called the Oxford comma, also known as the Harvard comma. There are discussions about which style is more precise, but this isn’t our focus. We aim to create a method that supports both styles.

So, given a list with more than two string elements, for instance, {“A”, “B”, “C”, … “X”, “Y”}, we may have two results depending on the requirement:

  • With Oxford comma – A, B, C, … X, and Y
  • Without Oxford comma – A, B, C, … X and Y

So far, we have only discussed lists with at least three elements. The result is different if the list holds fewer than three elements:

  • For an empty list, return an empty string, so, { } becomes “”
  • For a list with a single element, return that element. For example, {“A”} becomes “A”
  • When dealing with a list containing two string elements, combine them with the word “and” without using a comma. For instance, {“A”, “B”} becomes “A and B”

Next, let’s create a method to join a list of strings in a natural language-like manner. For simplicity, we assume the input list isn’t null and doesn’t contain null or empty string elements. In practice, if the list carries empty or null strings, we can filter out those elements first.
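The filtering step mentioned above could look something like the sketch below. The helper name dropNullAndEmpty() is hypothetical and isn't part of the original examples:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this sketch
class JoinInputCleaner {

    // Returns a new list without null or empty elements, preserving order
    static List<String> dropNullAndEmpty(List<String> items) {
        return items.stream()
          .filter(Objects::nonNull)
          .filter(s -> !s.isEmpty())
          .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Arrays.asList() is used here because List.of() rejects null elements
        List<String> raw = Arrays.asList("A", null, "", "B");
        System.out.println(dropNullAndEmpty(raw)); // [A, B]
    }
}
```

The cleaned list can then be passed to the joining method, so the method itself doesn't need to deal with these cases.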

3. Creating the joinItemsAsNaturalLanguage() Method

First, let’s look at the method implementation and then understand how it works:

String joinItemsAsNaturalLanguage(List<String> list, boolean oxfordComma) {
    if (list.size() < 3) {
        return String.join(" and ", list);
    }
    // list has at least three elements
    int lastIdx = list.size() - 1;

    StringBuilder sb = new StringBuilder();
    return sb.append(String.join(", ", list.subList(0, lastIdx)))
      .append(oxfordComma ? ", and " : " and ")
      .append(list.get(lastIdx))
      .toString();
}

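Now, let’s walk through the code quickly. First, we handle lists with fewer than three elements using String.join(" and ", list). This single call covers all three small cases: an empty list yields an empty string, a single element is returned unchanged, and two elements are joined as “A and B”.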

Then, if the list contains three or more strings, we use ", " as the separator to join the elements of a sublist of the input that excludes the last string. Finally, we append “and” followed by the last element to the joined result. The oxfordComma option decides whether a comma precedes “and”.

It’s worth noting that we shouldn’t take the approach of joining all elements with commas first and then replacing the last comma with “and”. This is because the last element might contain commas, too.
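To illustrate the pitfall, here’s a minimal sketch of that naive approach. The method joinThenReplaceLastComma() is hypothetical and serves only as a counterexample:

```java
import java.util.List;

// Hypothetical class name, used only for this sketch
class NaiveLastCommaReplacement {

    // Naive approach: join everything with ", " and then replace the last ", " with " and "
    static String joinThenReplaceLastComma(List<String> items) {
        String joined = String.join(", ", items);
        int lastComma = joined.lastIndexOf(", ");
        if (lastComma < 0) {
            return joined; // zero or one element without a comma: nothing to replace
        }
        return joined.substring(0, lastComma) + " and " + joined.substring(lastComma + 2);
    }

    public static void main(String[] args) {
        // Works as long as no element contains the separator
        System.out.println(joinThenReplaceLastComma(List.of("A", "B", "C")));    // A, B and C

        // Breaks when the last element contains ", ": the element itself gets modified
        System.out.println(joinThenReplaceLastComma(List.of("A", "B", "x, y"))); // A, B, x and y
    }
}
```

In the second call, the expected result is “A, B and x, y”, but the replacement lands inside the last element instead. Joining the sublist and appending the last element separately avoids this.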

Let’s test our solution without an Oxford comma:

assertEquals("", joinItemsAsNaturalLanguage(emptyList(), false));
assertEquals("A", joinItemsAsNaturalLanguage(List.of("A"), false));
assertEquals("A and B", joinItemsAsNaturalLanguage(List.of("A", "B"), false));
assertEquals("A, B, C, D and I have a comma (,)", joinItemsAsNaturalLanguage(List.of("A", "B", "C", "D", "I have a comma (,)"), false));

Finally, let’s test with an Oxford comma:

assertEquals("", joinItemsAsNaturalLanguage(emptyList(), true));
assertEquals("A", joinItemsAsNaturalLanguage(List.of("A"), true));
assertEquals("A and B", joinItemsAsNaturalLanguage(List.of("A", "B"), true));
assertEquals("A, B, C, D, and I have a comma (,)", joinItemsAsNaturalLanguage(List.of("A", "B", "C", "D", "I have a comma (,)"), true));

4. Conclusion

In this article, we discussed the problem of joining a list of strings in a natural language-like manner. Also, we learned how to create a method to solve this problem.
As always, the complete source code for the examples is available over on GitHub.
