Java Split String Performance

Last updated: October 4, 2025

Written by: baeldung

Reviewed by: Hiks Gerganov

1. Overview

String manipulation is a common operation in most programming languages, including Java. Whether for parsing log files, reading CSV data, or processing user input, we often need to break strings into smaller parts based on a delimiter.

While Java provides multiple approaches for splitting strings, their performance can vary significantly depending on the chosen method and the size of the data.

In this tutorial, we’ll explore different ways to split strings in Java, compare their performance, and provide best practices for selecting the most efficient approach.

2. Why Performance Matters in String Splitting

When working with small strings, performance differences may not matter. However, in applications that deal with large amounts of text data, selecting the right string splitting method can have a significant impact on speed and memory usage.

For instance, let’s check some scenarios where performance matters:

Log processing: Some systems parse millions of log lines into fields, where regex-based splitting can slow intake.
Data parsing: CSV or TSV files with millions of rows can quickly expose the cost of inefficient splitting.
High-throughput services: APIs that tokenize query parameters or headers may handle thousands of requests per second, so inefficient splitting can increase CPU load and degrade system responsiveness.
Memory usage: Frequent string splitting creates numerous temporary objects, increasing garbage collection pressure.

Choosing the right string-splitting method impacts performance, scalability, and responsiveness in production systems.

3. Java String Splitting Approaches

Java offers several ways to split strings, each with its own strengths, limitations, and performance characteristics. While all methods eventually break strings into smaller parts, their efficiency and ease of use vary depending on the input size, pattern complexity, and memory constraints.

3.1. Using String.split()

The String.split() method accepts a delimiter expressed as a regular expression and breaks the string at every occurrence of the delimiter, returning an array of substrings.

To demonstrate, let’s split a string:

public class SplitBasic {
    public static void main(String[] args) {
        String text = "apple,banana,orange,grape";
        String[] fruits = text.split(",");

        for (String fruit : fruits) {
            System.out.println(fruit);
        }
    }
}

Here, the comma is the delimiter. The method scans the string, finds matches for the delimiter, and slices the string into parts:

apple
banana
orange
grape

Furthermore, since this method uses regex, it can handle more complex cases than simple delimiters:

public class SplitWhitespace {
    public static void main(String[] args) {
        String text = "apple   banana\tgrape";
        String[] parts = text.split("\\s+");

        for (String part : parts) {
            System.out.println(part);
        }
    }
}

Above, the \\s+ regex matches one or more whitespace characters:

apple
banana
grape
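
One detail worth knowing about whitespace splitting: if the input starts with whitespace, the match at index 0 produces an empty first element. The sketch below shows that behavior and one possible way around it; the splitWords() helper and the sample input are made up for illustration.

```java
import java.util.Arrays;

public class SplitLeadingWhitespace {

    // Hypothetical helper: trims first so that no empty leading token appears
    static String[] splitWords(String text) {
        String trimmed = text.trim();
        // Splitting an empty string returns [""], so we handle that case explicitly
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    public static void main(String[] args) {
        String text = "  apple   banana\tgrape";

        // The whitespace run at index 0 yields an empty first element
        System.out.println(Arrays.toString(text.split("\\s+"))); // [, apple, banana, grape]

        // Trimming before splitting avoids it
        System.out.println(Arrays.toString(splitWords(text)));   // [apple, banana, grape]
    }
}
```

Trimming up front is usually simpler than filtering out empty elements afterwards.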

Additionally, String.split() accepts a limit parameter that caps the number of elements in the resulting array, so the delimiter is applied at most limit - 1 times:

public class SplitLimit {
    public static void main(String[] args) {
        String text = "a,b,c,d,e";

        String[] parts = text.split(",", 3);

        for (String part : parts) {
            System.out.println(part);
        }
    }
}

Here, the string is only split into three parts; the remainder, c,d,e, remains intact:

a
b
c,d,e

The above approach is useful when we only need the first few fields in a record.
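
The limit parameter also controls what happens to trailing empty fields, which matters for records with missing values at the end. As a small sketch, with a made-up record and a hypothetical splitKeepingEmpty() helper, we can compare a zero, negative, and positive limit:

```java
import java.util.Arrays;

public class SplitLimitSemantics {

    // Hypothetical helper: a negative limit keeps trailing empty fields
    static String[] splitKeepingEmpty(String record) {
        return record.split(",", -1);
    }

    public static void main(String[] args) {
        String record = "a,b,,";

        // limit = 0 (the default): trailing empty strings are dropped
        System.out.println(Arrays.toString(record.split(",")));         // [a, b]

        // limit < 0: every field is kept, including trailing empty ones
        System.out.println(Arrays.toString(splitKeepingEmpty(record))); // [a, b, , ]

        // limit > 0: at most that many elements, and the last one holds the rest
        System.out.println(Arrays.toString(record.split(",", 2)));      // [a, b,,]
    }
}
```

When the number of columns has to stay fixed, as in CSV-like data, the negative limit is usually the variant we want.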

In summary, because this method supports regular expression syntax, we can handle multiple delimiters and complex patterns in a single call. It’s also quick to implement, making it a good fit for small tasks.

On the other hand, the method can carry regular expression (regex) overhead. In OpenJDK, a delimiter that is a single character with no special regex meaning, such as a comma, takes a fast path that skips the regex engine, but any other delimiter is compiled into a Pattern on every call, which can slow things down. In addition, when used frequently on large datasets, each call creates a new array and new substrings, which increases garbage collection pressure.
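
Because the argument is always interpreted as a regex, delimiters such as a dot or a pipe need special care. The following sketch, using a hypothetical splitLiteral() helper and made-up inputs, shows the pitfall and how Pattern.quote() can treat the delimiter as plain text:

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class SplitLiteralDelimiter {

    // Hypothetical helper: treats the delimiter as literal text, not as a regex
    static String[] splitLiteral(String text, String delimiter) {
        return text.split(Pattern.quote(delimiter));
    }

    public static void main(String[] args) {
        String version = "1.20.3";

        // As a regex, "." matches any character, so every character is a delimiter
        // and only empty strings remain, which split() then drops
        System.out.println(version.split(".").length);                   // 0

        // Quoting the delimiter gives the expected result
        System.out.println(Arrays.toString(splitLiteral(version, "."))); // [1, 20, 3]
        System.out.println(Arrays.toString(splitLiteral("a|b|c", "|"))); // [a, b, c]
    }
}
```

Escaping by hand, for example with "\\.", works as well; Pattern.quote() is mainly handy when the delimiter comes from configuration or user input.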

3.2. Using Pattern.split()

The Pattern class in Java provides a way to work with compiled regular expressions. String.split() compiles the regex on every call unless the delimiter is simple enough for its internal fast path (in OpenJDK, essentially a single character with no special regex meaning). In contrast, Pattern.split() enables us to precompile the pattern once and reuse it across multiple operations. Because of this, it’s often a better choice when repeatedly splitting strings on a non-trivial pattern, such as in a loop over a large dataset.

For instance, let’s split on whitespace:

import java.util.regex.Pattern;

public class PatternSplitExample {
    public static void main(String[] args) {
        String logEntry = "2025-09-18 10:35:22 INFO User=samuel Action=login Status=success";

        Pattern whitespace = Pattern.compile("\\s+");

        String[] fields = whitespace.split(logEntry);

        for (String field : fields) {
            System.out.println(field);
        }
    }
}

Here, we precompiled the \\s+ pattern once, and then used it to split the string:

2025-09-18
10:35:22
INFO
User=samuel
Action=login
Status=success

When processing thousands of lines in a loop, this can save time compared to calling String.split() with the same regex, since the pattern isn’t recompiled for every line.

It can also handle complex delimiters:

import java.util.regex.Pattern;

public class PatternSplitMultiDelimiter {
    public static void main(String[] args) {
        String text = "apple,banana;grape orange";

        // Compile regex that matches comma, semicolon, or space
        Pattern pattern = Pattern.compile("[,; ]");

        String[] parts = pattern.split(text);

        for (String part : parts) {
            System.out.println(part);
        }
    }
}

Above, we split a string on commas, semicolons, or spaces:

apple
banana
grape
orange

For regex delimiters like these, using Pattern.compile() and Pattern.split() is often better than the String.split() method because it avoids the repeated cost of compiling the regular expression every time we split a string.
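
To make the reuse concrete, here is a minimal sketch that keeps the compiled pattern in a static final field and applies it to several lines. The LogLineSplitter class, the splitLines() helper, and the second log line are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class LogLineSplitter {

    // Compiled once when the class loads; Pattern instances are immutable and thread-safe
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Splits every line with the shared pattern instead of recompiling the regex per line
    static List<String[]> splitLines(List<String> lines) {
        List<String[]> rows = new ArrayList<>(lines.size());
        for (String line : lines) {
            rows.add(WHITESPACE.split(line));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "2025-09-18 10:35:22 INFO User=samuel Action=login Status=success",
            "2025-09-18 10:35:40 WARN User=samuel Action=upload Status=retry");

        for (String[] fields : splitLines(lines)) {
            // The third field holds the log level in this sample format
            System.out.println(fields[2] + " -> " + fields.length + " fields");
        }
    }
}
```

Since a compiled Pattern can be shared safely between threads, a single constant like this is typically enough for a whole application.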

3.3. Using String.indexOf() and substring()

Instead of relying on regex or tokenizer classes, we can directly scan the string for delimiter positions using String.indexOf(), and then extract the substrings between those positions using substring(). This avoids the overhead of regex processing and of other abstractions.

To demonstrate, let’s split a comma-separated string:

public class ManualSplitExample {
    public static void main(String[] args) {
        String text = "apple,banana,grape";
        int start = 0;
        int index;

        while ((index = text.indexOf(",", start)) >= 0) {
            String token = text.substring(start, index);
            System.out.println(token);
            start = index + 1;
        }

        String lastToken = text.substring(start);
        System.out.println(lastToken);
    }
}

Above, we manually split the string based on the comma delimiter:

apple
banana
grape

Specifically, we use indexOf(",", start) to search for the comma delimiter starting at position start. If found, we extract the substring between start and the delimiter index. Then, we move start just after the delimiter and continue scanning until we extract the last token after the final delimiter.

Since it avoids the overhead of compiling and executing regular expressions, this approach can be the fastest for small to medium strings with simple delimiters. However, in our benchmark later in this article, it falls behind String.split() on very large strings.

Furthermore, this approach provides direct control over the parsing process, enabling us to handle edge cases such as trailing delimiters, empty tokens, and whitespace trimming.

However, since it doesn’t support regular expressions, indexOf() is often not suitable for complex delimiter patterns like multiple whitespace characters. Also, because of the fairly manual implementation, it’s easier to introduce bugs if we forget to handle certain edge cases, such as delimiters at the start or end of a string.
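
As an illustration of that control, the following sketch wraps the same loop in a reusable method that trims each token and keeps empty ones, including those produced by leading and trailing delimiters. The ManualSplitEdgeCases class and the sample input are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class ManualSplitEdgeCases {

    // Splits on a single character, trims each token, and keeps empty tokens
    static List<String> split(String text, char delimiter) {
        List<String> tokens = new ArrayList<>();
        int start = 0;
        int index;
        while ((index = text.indexOf(delimiter, start)) >= 0) {
            tokens.add(text.substring(start, index).trim());
            start = index + 1;
        }
        // Whatever follows the last delimiter, possibly an empty string
        tokens.add(text.substring(start).trim());
        return tokens;
    }

    public static void main(String[] args) {
        // Leading delimiter, padded token, empty token, and trailing delimiter
        System.out.println(split(",apple, banana ,,grape,", ','));
        // [, apple, banana, , grape, ]
    }
}
```

Unlike String.split() with its default limit, this version never drops trailing empty tokens, so the number of fields always equals the number of delimiters plus one.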

4. Performance Benchmarking

Now that we’ve explored the different approaches, let’s measure their performance using the Java Microbenchmark Harness (JMH) and Maven.

4.1. Maven Dependencies

First, we’ll add the JMH dependencies to our Maven project:

  <dependencies>
    <!-- JMH Benchmarking -->
    <dependency>
      <groupId>org.openjdk.jmh</groupId>
      <artifactId>jmh-core</artifactId>
      <version>1.36</version>
    </dependency>
    <dependency>
      <groupId>org.openjdk.jmh</groupId>
      <artifactId>jmh-generator-annprocess</artifactId>
      <version>1.36</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>

With all dependencies taken care of, we can write the actual tests.

4.2. Implement Split String Performance Tests

Next, let’s create a new class SplitStringPerformance:

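import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.regex.Pattern;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode(Mode.AverageTime)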
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Fork(value = 1)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
@State(Scope.Thread)
public class SplitStringPerformance {

    @Param({"10", "1000", "100000"})
    public int tokenCount;

    private static final String DELIM = ",";
    private String text;
    private Pattern commaPattern;

    @Setup(Level.Trial)
    public void setup() {
        StringBuilder sb = new StringBuilder(tokenCount * 8);
        for (int i = 0; i < tokenCount; i++) {
            sb.append("token").append(i);
            if (i < tokenCount - 1) sb.append(DELIM);
        }
        text = sb.toString();
        commaPattern = Pattern.compile(",");
    }

    @Benchmark
    public void stringSplit(Blackhole bh) {
        String[] parts = text.split(DELIM);
        bh.consume(parts.length);
    }

    @Benchmark
    public void patternSplit(Blackhole bh) {
        String[] parts = commaPattern.split(text);
        bh.consume(parts.length);
    }

    @Benchmark
    public void manualSplit(Blackhole bh) {
        List<String> tokens = new ArrayList<>(tokenCount);
        int start = 0, idx;
        while ((idx = text.indexOf(DELIM, start)) >= 0) {
            tokens.add(text.substring(start, idx));
            start = idx + 1;
        }
        tokens.add(text.substring(start));
        bh.consume(tokens.size());
    }
}

Here, we use the above code to benchmark three approaches for splitting a comma-separated string:

String.split()
Pattern.split()
String.indexOf() and substring()

The benchmark runs with three input token sizes:

10
1000
100000

This way, we compare performance across small and large datasets.
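
Before reading too much into the timings, it can be worth confirming that the three approaches return the same tokens for the benchmark input, since otherwise we’d be comparing different work. Below is a plain-Java sketch of such a check that needs no JMH; the SplitEquivalenceCheck class and its helper names are hypothetical, while the input is built the same way as in setup().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SplitEquivalenceCheck {

    private static final Pattern COMMA = Pattern.compile(",");

    // Builds "token0,token1,..." like the benchmark's setup() method
    static String buildInput(int tokenCount) {
        StringBuilder sb = new StringBuilder(tokenCount * 8);
        for (int i = 0; i < tokenCount; i++) {
            sb.append("token").append(i);
            if (i < tokenCount - 1) sb.append(',');
        }
        return sb.toString();
    }

    // Same loop as the manualSplit benchmark, returning the tokens
    static List<String> manualSplit(String text) {
        List<String> tokens = new ArrayList<>();
        int start = 0, idx;
        while ((idx = text.indexOf(',', start)) >= 0) {
            tokens.add(text.substring(start, idx));
            start = idx + 1;
        }
        tokens.add(text.substring(start));
        return tokens;
    }

    // True when all three approaches yield identical token lists
    static boolean allAgree(String text) {
        List<String> viaString = Arrays.asList(text.split(","));
        List<String> viaPattern = Arrays.asList(COMMA.split(text));
        return viaString.equals(viaPattern) && viaString.equals(manualSplit(text));
    }

    public static void main(String[] args) {
        System.out.println(allAgree(buildInput(1000))); // true
        // A trailing delimiter exposes a difference: split() drops the trailing
        // empty token, whereas the manual loop keeps it
        System.out.println(allAgree("a,b,"));           // false
    }
}
```

For the generated input, which has no empty tokens, all three agree, so the benchmark compares equivalent results.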

4.3. Run the Tests

At this point, we should be able to run the benchmark.

First, let’s build the project:

$ mvn -DskipTests package

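The command compiles the main Java source code and, provided the build is set up to package an executable JMH JAR (the JMH Maven archetype does this with the maven-shade-plugin), creates a benchmarks.jar file in the target directory.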

Once the file is created, let’s run the benchmark and take a look at the results:

$ java -jar target/benchmarks.jar
...
Benchmark                            (tokenCount)  Mode  Cnt      Score      Error  Units
SplitStringPerformance.manualSplit             10  avgt   10      0.334 ±    0.041  us/op
SplitStringPerformance.manualSplit           1000  avgt   10     46.469 ±    6.864  us/op
SplitStringPerformance.manualSplit         100000  avgt   10  22698.745 ± 4779.351  us/op
SplitStringPerformance.patternSplit            10  avgt   10      0.998 ±    0.267  us/op
SplitStringPerformance.patternSplit          1000  avgt   10    103.649 ±   19.582  us/op
SplitStringPerformance.patternSplit        100000  avgt   10  10929.489 ± 2556.689  us/op
SplitStringPerformance.stringSplit             10  avgt   10      0.606 ±    0.163  us/op
SplitStringPerformance.stringSplit           1000  avgt   10     51.525 ±   10.154  us/op
SplitStringPerformance.stringSplit         100000  avgt   10   5914.462 ± 1001.699  us/op

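The output leads to two main conclusions. First, for small strings, all methods are fast: the manual approach has the lowest average at 10 tokens, while at 1,000 tokens it’s within the error margin of String.split().

Yet, for large strings, String.split() is the fastest overall in this run. This is likely because a single-character delimiter such as a comma lets String.split() bypass the regex engine, while Pattern.split() still runs a regex matcher. Manual splitting with indexOf() and substring() performs the worst at scale. Notably, the error margins are wide and the benchmark uses a single fork, so the exact numbers can differ between machines and JVM versions.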

5. Conclusion

In this article, we discussed multiple approaches to splitting strings in Java: String.split(), Pattern.split(), and String.indexOf() combined with substring(). We also measured and compared the performance of each method.

For small inputs, all options are fast. For large inputs with a simple single-character delimiter, String.split() was the fastest in our benchmark, since it can skip the regex engine for such delimiters, and it’s also the most convenient option. A carefully written manual scan gives us full control over edge cases and did well on small inputs, but it fell behind at scale. Lastly, for repeated complex patterns, precompiling with Pattern is usually best.

The code backing this article is available on GitHub.