1. Overview

Working with strings is a fundamental task in Java programming, and at times, we need to split a string into multiple substrings for further processing. Whether it’s parsing user input or processing data files, knowing how to break strings effectively is essential.

In this tutorial, we’ll explore different approaches and techniques for breaking an input string into a string array or list containing digit and non-digit string elements in the original order.

2. Introduction to the Problem

As usual, let’s understand the problem through examples.

Let’s say we have two input strings:

String INPUT1 = "01Michael Jackson23Michael Jordan42Michael Bolton999Michael Johnson000";
String INPUT2 = "Michael Jackson01Michael Jordan23Michael Bolton42Michael Johnson999Great Michaels";

As the examples above show, both strings consist of consecutive digit and non-digit characters. For example, consecutive digit substrings in INPUT1 are “01”, “23”, “42”, “999”, and “000”. The non-digit substrings are “Michael Jackson”, “Michael Jordan”, “Michael Bolton”, and so on.

INPUT2 is similar. The difference is it starts with a non-digit string. Therefore, we can conclude a few input characteristics:

  • The length of digit or non-digit substrings is dynamic.
  • The input string can start with a digit or non-digit substring.

We aim to break the input string into an array or list of these string elements:

String[] EXPECTED1 = new String[] { "01", "Michael Jackson", "23", "Michael Jordan", "42", "Michael Bolton", "999", "Michael Johnson", "000" };
List<String> EXPECTED_LIST1 = Arrays.asList(EXPECTED1);

String[] EXPECTED2 = new String[] { "Michael Jackson", "01", "Michael Jordan", "23", "Michael Bolton", "42", "Michael Johnson", "999", "Great Michaels" };
List<String> EXPECTED_LIST2 = Arrays.asList(EXPECTED2);

In this tutorial, we’ll solve this problem using both regex-based and non-regex-based approaches. Further, we’ll compare their performance at the end.

For simplicity, we’ll use unit test assertions to verify whether each approach works as expected.

3. Using the String.split() Method

First, let’s solve this problem using a regex-based approach. We know that the String.split() method is a handy tool for splitting a String into an array. For example, "a, b, c, d".split(", ") returns a string array: {"a", "b", "c", "d"}.

So, using the split() method could be the first idea we come up with to solve our problem. Then, we need to find a regex pattern to use as the separator so that split() produces the expected result. However, we may notice a difficulty when we think it through.

Let’s revisit the "a, b, c, d".split(", ") example. We used ", " as the separator regex pattern and got the string elements in the array result: "a", "b", "c", and "d". If we look at the result, we’ll see that none of the matched separators (", ") appear in the string array.

However, if we look at the inputs and expected outputs of our problem, every character in the input appears in the result array or list. Therefore, if we want to use split() to solve the problem, we must use a pattern of zero-length assertions, for example, the lookaround (lookahead and lookbehind) assertions. Next, let’s analyze our input string:

01[!]Michael Jackson[!]23[!]Michael Jordan[!]42[!]Michael Bolton...

To make it clear, we marked the desired separators using '[!]' in the input above. Each separator sits either between a \d (digit character) and a \D (non-digit character) or between a \D and a \d. If we translate this into a lookaround regex pattern, it’s (?<=\D)(?=\d)|(?<=\d)(?=\D).
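To see why zero-length assertions matter here, it may help to try the pieces of the pattern separately on a tiny input. The following is a minimal sketch; the class name LookaroundSplitDemo, the helper split(), and the sample string "ab12cd" are our own choices for illustration and aren’t part of the original examples:

```java
import java.util.Arrays;

class LookaroundSplitDemo {
    // Matches the empty position where a non-digit is followed by a digit
    static final String NON_DIGIT_TO_DIGIT = "(?<=\\D)(?=\\d)";
    // Matches the empty position where a digit is followed by a non-digit
    static final String DIGIT_TO_NON_DIGIT = "(?<=\\d)(?=\\D)";
    // Both boundaries combined, as in the pattern discussed above
    static final String BOTH = NON_DIGIT_TO_DIGIT + "|" + DIGIT_TO_NON_DIGIT;

    // Thin wrapper so each pattern can be tried on the same input
    static String[] split(String input, String regex) {
        return input.split(regex);
    }

    public static void main(String[] args) {
        String sample = "ab12cd"; // hypothetical sample input

        // A consuming separator removes the matched characters from the result
        System.out.println(Arrays.toString(split(sample, "\\d+")));              // [ab, cd]

        // Each lookaround half splits at only one kind of boundary
        System.out.println(Arrays.toString(split(sample, NON_DIGIT_TO_DIGIT))); // [ab, 12cd]
        System.out.println(Arrays.toString(split(sample, DIGIT_TO_NON_DIGIT))); // [ab12, cd]

        // Combined, every character is kept and both boundaries are split
        System.out.println(Arrays.toString(split(sample, BOTH)));               // [ab, 12, cd]
    }
}
```

Because the lookaround matches have zero length, split() has nothing to discard, so every input character ends up in one of the resulting elements.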

Next, let’s write a test to verify if using split(), with this pattern, on the two inputs produces the desired results:

String splitRE = "(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)";
String[] result1 = INPUT1.split(splitRE);
assertArrayEquals(EXPECTED1, result1);

String[] result2 = INPUT2.split(splitRE);
assertArrayEquals(EXPECTED2, result2);

The test passes if we give it a run. So, we’ve solved the problem using the split() method.

Next, let’s solve the problem using a non-regex approach.

4. A Non-Regex-Based Approach

We’ve seen how to solve the problem using the regex-based split() approach. Alternatively, we can solve it without using pattern matching.

The idea is to walk through the input string character by character from the beginning and cut a new element whenever the character type changes. Let’s first look at the implementation and then understand how it works:

enum State {
    INIT, PARSING_DIGIT, PARSING_NON_DIGIT
}

List<String> parseString(String input) {
    List<String> result = new ArrayList<>();
    int start = 0; // start index of the element currently being read
    State state = State.INIT;
    for (int i = 0; i < input.length(); i++) {
        char c = input.charAt(i);
        if (c >= '0' && c <= '9') {
            if (state == State.PARSING_NON_DIGIT) { // non-digit to digit, get the substring as an element
                result.add(input.substring(start, i));
                start = i;
            }
            state = State.PARSING_DIGIT;
        } else {
            if (state == State.PARSING_DIGIT) { // digit to non-digit, get the substring as an element
                result.add(input.substring(start, i));
                start = i;
            }
            state = State.PARSING_NON_DIGIT;
        }
    }
    result.add(input.substring(start)); // add the last part
    return result;
}

Now, let’s walk through the code above quickly and understand how it works:

  • First, we initialize an empty ArrayList called result to store the extracted elements.
  • The start variable, initialized to 0, keeps track of the start index of the current substring during the iteration.
  • The state variable holds a State enum value. It records whether the previous character was a digit or a non-digit, and it’s INIT before the first character is read.
  • Then, we use a for loop to iterate through the input string characters and check each character’s type.
  • If the current character is a digit (0-9) and state is PARSING_NON_DIGIT, we’ve hit a non-digit to digit transition, which means an element has ended. So, we add the substring from start to i (exclusive) to the result list and update the start index to the current index i. In either case, we then set state to PARSING_DIGIT.
  • The else block follows the same logic and handles the digit to non-digit transition.
  • After the for loop ends, we shouldn’t forget to add the last part of the string to the result list by using input.substring(start).
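A few boundary inputs can help confirm the walkthrough above. The sketch below repeats parseString() so that it’s self-contained; the wrapper class name ParseStringEdgeCases and the sample inputs are our own assumptions, chosen only for illustration:

```java
import java.util.ArrayList;
import java.util.List;

class ParseStringEdgeCases {
    enum State {
        INIT, PARSING_DIGIT, PARSING_NON_DIGIT
    }

    // Same logic as the parseString() method discussed above
    static List<String> parseString(String input) {
        List<String> result = new ArrayList<>();
        int start = 0;
        State state = State.INIT;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean digit = c >= '0' && c <= '9';
            State next = digit ? State.PARSING_DIGIT : State.PARSING_NON_DIGIT;
            // Cut an element only on a real transition, never when leaving INIT
            if (state != State.INIT && state != next) {
                result.add(input.substring(start, i));
                start = i;
            }
            state = next;
        }
        result.add(input.substring(start)); // the last (or only) element
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseString("2024"));    // [2024]   only digits: one element
        System.out.println(parseString("abc"));     // [abc]    only non-digits: one element
        System.out.println(parseString("a1"));      // [a, 1]   a single transition
        System.out.println(parseString("").size()); // 1        the empty string yields one empty element
    }
}
```

The two nested if checks from the original method are folded into one comparison here purely for brevity; the behavior is intended to be the same. The INIT state is what prevents an empty first element from being added when the loop reads the very first character.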

Next, let’s test the parseString() method with our two inputs:

List<String> result1 = parseString(INPUT1);
assertEquals(EXPECTED_LIST1, result1);

List<String> result2 = parseString(INPUT2);
assertEquals(EXPECTED_LIST2, result2);

If we run the test, it passes. So, our parseString() method does the job.

5. Performance

So far, we’ve covered two solutions to the problem: regex-based and non-regex-based. The regex-based split() solution is pretty straightforward, just a single method call. In contrast, our hand-written parseString() method runs to about twenty lines and has to inspect every character in the input itself. So, some of us may ask why we’d introduce, let alone use, the hand-written method to solve the problem.

The answer is “performance.”

Although our parseString() solution looks lengthy and requires manual control of each character, it’s faster than the regex-based solution. Let’s understand the reasons for this:

  • The split() solution has to compile the regex pattern on every call and then apply pattern matching. These operations are comparatively expensive, especially for patterns with lookarounds. In contrast, the parseString() method uses a simple enum-based state machine to track transitions between digit and non-digit characters. It needs only direct character comparisons and avoids the overhead of regex pattern matching and lookarounds.
  • In the parseString() method, substrings are extracted directly with substring() using indices we already know. This avoids the intermediate objects, such as the compiled pattern and its matcher, that the regex-based split() creates on each call.

However, the difference in performance may be negligible if the input string isn’t considerably long.

Next, let’s benchmark the performance of these two approaches. We’ll use JMH (the Java Microbenchmark Harness) to do that. This is because JMH allows us to easily handle benchmarking factors, such as JVM warm-up, dead code elimination, and so on:

@State(Scope.Benchmark)
@Threads(1)
@BenchmarkMode(Mode.Throughput)
@Fork(warmups = 1, value = 1)
@Warmup(iterations = 2, time = 10, timeUnit = TimeUnit.MILLISECONDS)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
public class BenchmarkLiveTest {
    private static final String INPUT = "01Michael Jackson23Michael Jordan42Michael Bolton999Michael Johnson000";

    @Param({ "10000" })
    public int iterations;

    @Benchmark
    public void regexBased(Blackhole blackhole) {
        blackhole.consume(INPUT.split("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)"));
    }

    @Benchmark
    public void nonRegexBased(Blackhole blackhole) {
        blackhole.consume(parseString(INPUT));
    }

    @Test
    public void benchmark() throws Exception {
        String[] argv = {};
        org.openjdk.jmh.Main.main(argv);
    }
}

As the above class shows, we benchmark the two approaches using the same input. Note that neither benchmark method reads the iterations parameter, so it only shows up as a column in the report; JMH itself decides how many times to invoke each method during a measurement. We won’t dive into every JMH annotation here, but two of them are important for understanding the final report: @OutputTimeUnit(TimeUnit.MILLISECONDS) and @BenchmarkMode(Mode.Throughput). This combination means we measure how many times each approach can run per millisecond.

Next, let’s take a look at the result JMH generates:

Benchmark                        (iterations)   Mode  Cnt     Score     Error   Units
BenchmarkLiveTest.nonRegexBased         10000  thrpt    5  3880.989 ± 134.021  ops/ms
BenchmarkLiveTest.regexBased            10000  thrpt    5   297.282 ±  24.818  ops/ms

As we can see, the non-regex-based solution’s throughput is about 13 times higher than the regex-based solution’s (3880/297 ≈ 13.06). Therefore, when we need to handle long strings in a performance-critical application, we should choose parseString() over the split() solution.
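One of the costs listed earlier in this section is compiling the pattern on each split() call. A possible variation, which isn’t measured in the benchmark above, is to compile the pattern once and reuse it through Pattern.split(). The sketch below shows the idea; the class name PrecompiledSplitDemo, the constant DIGIT_BOUNDARY, and the method splitPrecompiled() are hypothetical names introduced only for this example:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

class PrecompiledSplitDemo {
    // Compiled once when the class is loaded, then reused for every call
    private static final Pattern DIGIT_BOUNDARY =
        Pattern.compile("(?<=\\D)(?=\\d)|(?<=\\d)(?=\\D)");

    // Same result as input.split(regex), but without recompiling the pattern
    static String[] splitPrecompiled(String input) {
        return DIGIT_BOUNDARY.split(input);
    }

    public static void main(String[] args) {
        String input = "01Michael Jackson23Michael Jordan";
        // [01, Michael Jackson, 23, Michael Jordan]
        System.out.println(Arrays.toString(splitPrecompiled(input)));
    }
}
```

This removes only the compilation step; the lookaround matching itself still runs on every call, so we’d need to benchmark it before assuming how close it gets to parseString().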

6. Conclusion

In this article, we’ve explored regex-based (split()) and non-regex-based (parseString()) approaches to breaking an input string into a string array or list containing digit elements and non-digit string elements in the original order.

The split() solution is compact and straightforward. However, when dealing with long input strings, it can be significantly slower than the self-made parseString() solution.

The code backing this article is available on GitHub.