Google Protobuf ByteString vs. Byte[]

Last updated: October 2, 2025

Written by: Sudarshan Hiray

Reviewed by: David Martinez

1. Introduction

When working with Google’s Protocol Buffers (Protobuf) in Java, we inevitably encounter the need to handle binary data. This often leads to a choice between the standard byte[] and Protobuf’s custom ByteString class. While both represent a sequence of bytes, they have fundamental differences in their design and intended use.

In this article, we’ll explore the characteristics of both types, highlight their key differences with code examples, and provide guidance on when to use each for optimal performance and maintainability.

2. Defining Maven Dependencies

To start, we’ll need to include the protobuf-java dependency in our project:

<dependency>
    <groupId>com.google.protobuf</groupId>
    <artifactId>protobuf-java</artifactId>
    <version>4.31.1</version>
</dependency>

This dependency provides access to the ByteString class and the necessary Protobuf APIs.

3. Understanding byte[]

The byte[] is a core Java data structure for representing a sequence of raw bytes. Its primary characteristic is mutability. This allows us to modify its elements directly after creation, which is essential for tasks like building a buffer to read data from a stream.

Let’s illustrate its mutable nature with a simple example test. We’ll define a byte array and then replace an element in it:

@Test
public void givenByteArray_whenModified_thenChangesPersist() throws IOException {
    // Here, we'll initialize a mutable buffer
    byte[] data = new byte[4];

    // We'll read data into the buffer and check that all four bytes arrived
    ByteArrayInputStream inputStream = new ByteArrayInputStream(new byte[] { 0x01, 0x02, 0x03, 0x04 });
    assertEquals(4, inputStream.read(data));

    // Note, the first byte is 1
    assertEquals(1, data[0]);

    // We can directly modify the first byte
    data[0] = 0x05;

    // The modification is persisted
    assertEquals(5, data[0]);
}

As shown in the above test, a byte[] can be changed in-place, making it a flexible choice for scenarios where we need to manipulate the contents of a buffer.
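The same flexibility has a cost when an array is shared. The following is a minimal sketch, using two hypothetical holder classes (SharedHolder and CopyingHolder are illustrative names, not part of any library), of how a change made by the caller leaks into an object that keeps the caller's array, and how a defensive copy prevents that:

```java
import java.util.Arrays;

public class ByteArrayAliasingExample {

    // Hypothetical holder that keeps a reference to the caller's array
    static final class SharedHolder {
        private final byte[] payload;

        SharedHolder(byte[] payload) {
            this.payload = payload;
        }

        byte first() {
            return payload[0];
        }
    }

    // Hypothetical holder that takes a defensive copy of the caller's array
    static final class CopyingHolder {
        private final byte[] payload;

        CopyingHolder(byte[] payload) {
            this.payload = Arrays.copyOf(payload, payload.length);
        }

        byte first() {
            return payload[0];
        }
    }

    public static void main(String[] args) {
        byte[] data = { 0x01, 0x02, 0x03, 0x04 };
        SharedHolder shared = new SharedHolder(data);
        CopyingHolder copied = new CopyingHolder(data);

        // The caller modifies its array after handing it over
        data[0] = 0x05;

        System.out.println(shared.first()); // prints 5
        System.out.println(copied.first()); // prints 1
    }
}
```

With a plain byte[], every class that stores the data has to decide for itself whether to copy.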

4. Understanding ByteString

ByteString is a class provided by the Protobuf library for handling sequences of bytes. Unlike byte[], ByteString is immutable. Once created, its contents cannot be altered, which is similar to how the String class works in Java.

This immutability offers two main advantages. The first is thread safety: an immutable object can be shared across multiple threads without synchronization.

The second is efficiency: operations like substring() and concat() can avoid copying. Instead of duplicating all the data, these methods often create new ByteString objects that reference the original data, which saves both memory and CPU time.

Let’s look at the immutability of ByteString:

@Test
public void givenByteString_whenCreated_thenIsImmutable() {
    // We'll create an immutable ByteString from a mutable byte array
    byte[] originalArray = new byte[] { 0x01, 0x02, 0x03, 0x04 };
    ByteString byteString = ByteString.copyFrom(originalArray);
        
    // The value of the first byte is 1
    assertEquals(1, byteString.byteAt(0));
        
    // We'll try to modify the original array
    originalArray[0] = 0x05;
        
    // The ByteString's contents remain unchanged
    assertEquals(1, byteString.byteAt(0));
}

The test confirms that even if the source byte[] is modified, the ByteString remains unchanged. This behavior is key to its reliability within Protobuf.
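Immutability also means that deriving a new value never touches the source. As a small sketch (the class name and sample text are our own choices), substring() returns a new ByteString and leaves the original exactly as it was, much like String.substring():

```java
import com.google.protobuf.ByteString;

public class ByteStringSubstringExample {

    // Returns the bytes in [start, end) as a new ByteString; 'source' is not modified
    static ByteString slice(ByteString source, int start, int end) {
        return source.substring(start, end);
    }

    public static void main(String[] args) {
        ByteString original = ByteString.copyFromUtf8("Hello, Protobuf");

        ByteString greeting = slice(original, 0, 5);
        ByteString subject = original.substring(7);

        System.out.println(greeting.toStringUtf8()); // prints Hello
        System.out.println(subject.toStringUtf8());  // prints Protobuf
        System.out.println(original.toStringUtf8()); // prints Hello, Protobuf
    }
}
```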

5. Key Differences

The contrasting natures of byte[] and ByteString lead to key differences that influence our design decisions.

5.1. Mutability vs. Immutability

This is the most fundamental difference. byte[] is mutable, making it ideal for data that needs to be modified in place, such as in-memory buffers or during stream processing.

In contrast, ByteString is immutable, which ensures data integrity and thread-safety. This makes it the perfect choice for persistent or shared data, especially within the context of a message format.

5.2. Performance

For simple reads, such as accessing a byte by index, performance is similar. However, ByteString shows its advantage in more complex operations like concatenation.

To concatenate two byte[] arrays, we must allocate a new, larger array and copy all the data into it, which can be expensive for large inputs. ByteString’s concat() method can instead create a new instance that references both original objects without performing a full data copy, which reduces memory allocations.
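To make the comparison concrete, here is a rough sketch of both approaches side by side. The helper names concatArrays and concatByteStrings are ours. Whether concat() actually avoids a copy is an implementation detail, and the library may still copy when the inputs are small, so this shows the API shape rather than a benchmark:

```java
import com.google.protobuf.ByteString;
import java.util.Arrays;

public class ConcatComparisonExample {

    // Plain arrays: allocate the result and copy both inputs into it
    static byte[] concatArrays(byte[] first, byte[] second) {
        byte[] result = new byte[first.length + second.length];
        System.arraycopy(first, 0, result, 0, first.length);
        System.arraycopy(second, 0, result, first.length, second.length);
        return result;
    }

    // ByteString: a single call that returns a new immutable value
    static ByteString concatByteStrings(ByteString first, ByteString second) {
        return first.concat(second);
    }

    public static void main(String[] args) {
        byte[] joinedArray = concatArrays(new byte[] { 1, 2 }, new byte[] { 3, 4 });
        System.out.println(Arrays.toString(joinedArray)); // prints [1, 2, 3, 4]

        ByteString joined = concatByteStrings(
            ByteString.copyFromUtf8("foo"), ByteString.copyFromUtf8("bar"));
        System.out.println(joined.toStringUtf8()); // prints foobar
    }
}
```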

5.3. API and Protobuf Integration

byte[] has a minimal API, so most complex operations require custom logic. ByteString, on the other hand, offers a rich API tailored for binary data, including methods like startsWith(), endsWith(), substring(), and concat().
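As an illustration of that API, the sketch below checks whether some image bytes begin with the 8-byte PNG file signature and strips it off. The class and method names are hypothetical. With a byte[], the same check would need a manual loop or a range comparison:

```java
import com.google.protobuf.ByteString;

public class ByteStringApiExample {

    // The fixed 8-byte signature that starts every PNG file
    static final ByteString PNG_SIGNATURE = ByteString.copyFrom(new byte[] {
        (byte) 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A
    });

    // True if the data begins with the PNG signature
    static boolean looksLikePng(ByteString image) {
        return image.startsWith(PNG_SIGNATURE);
    }

    // Returns the bytes after the signature, or the input unchanged if it is absent
    static ByteString stripSignature(ByteString image) {
        return looksLikePng(image) ? image.substring(PNG_SIGNATURE.size()) : image;
    }

    public static void main(String[] args) {
        ByteString image = PNG_SIGNATURE.concat(ByteString.copyFromUtf8("rest"));

        System.out.println(looksLikePng(image));                    // prints true
        System.out.println(stripSignature(image).toStringUtf8());   // prints rest
        System.out.println(image.endsWith(ByteString.copyFromUtf8("rest"))); // prints true
    }
}
```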

Most importantly, ByteString is the native type for the bytes fields within Protobuf messages. It ensures seamless and efficient serialization and deserialization. We can see this by looking at a simple Protobuf definition:

message UserData {
  string name = 1;
  bytes profile_image = 2;
}

The generated Java class will represent the profile_image field as a ByteString, not a byte[]. This integration is a core part of Protobuf’s design.
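Running the UserData example requires compiling the .proto file with protoc first. To keep a sketch self-contained, we can instead use BytesValue, a wrapper message bundled with protobuf-java that holds a single bytes field. Its accessors take and return ByteString, and we'd expect the generated UserData class to follow the same pattern for profile_image. The class and method names below are our own:

```java
import com.google.protobuf.ByteString;
import com.google.protobuf.BytesValue;
import com.google.protobuf.InvalidProtocolBufferException;

public class BytesFieldRoundTripExample {

    // Puts raw bytes into a message, serializes it, parses it back, and returns the field
    static ByteString roundTrip(byte[] raw) {
        BytesValue message = BytesValue.newBuilder()
            .setValue(ByteString.copyFrom(raw)) // the setter accepts a ByteString, not a byte[]
            .build();

        byte[] wire = message.toByteArray(); // serialized form of the whole message

        try {
            return BytesValue.parseFrom(wire).getValue(); // the getter returns a ByteString
        } catch (InvalidProtocolBufferException e) {
            // Not expected here, since we parse bytes we have just serialized
            throw new IllegalStateException("Could not parse freshly serialized message", e);
        }
    }

    public static void main(String[] args) {
        ByteString value = roundTrip(new byte[] { 0x01, 0x02, 0x03 });
        System.out.println(value.size());    // prints 3
        System.out.println(value.byteAt(0)); // prints 1
    }
}
```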

6. Conversion Between Types

In practice, we’ll often need to convert between byte[] and ByteString when interoperating with standard Java APIs.

6.1. byte[] to ByteString

To convert a byte[] to a ByteString, we use the static ByteString.copyFrom() method. This operation creates a new ByteString and copies the data, ensuring the new instance’s immutability:

@Test
public void givenByteArray_whenCopiedToByteString_thenDataIsCopied() {
    // We'll start with a mutable byte array
    byte[] byteArray = new byte[] { 0x01, 0x02, 0x03 };
        
    // Create a new ByteString from it
    ByteString byteString = ByteString.copyFrom(byteArray);

    // We'll assert that the data is the same
    assertEquals(byteArray[0], byteString.byteAt(0));
        
    // Here, we change the original array
    byteArray[0] = 0x05;

    // Note, the ByteString remains unchanged, confirming the copy
    assertEquals(1, byteString.byteAt(0));
    assertNotSame(byteArray, byteString.toByteArray());
}

6.2. ByteString to byte[]

The conversion in the other direction uses the toByteArray() method. This method returns a new byte[] instance with a copy of the ByteString’s data:

@Test
public void givenByteString_whenConvertedToByteArray_thenDataIsCopied() {
    // We'll start with an immutable ByteString
    ByteString byteString = ByteString.copyFromUtf8("Baeldung");
        
    // Create a mutable byte array from it
    byte[] byteArray = byteString.toByteArray();

    // Here, the byte array now has a copy of the data
    assertEquals('B', (char) byteArray[0]);
        
    // We'll change the new array
    byteArray[0] = 'X';

    // Note, the original ByteString remains unchanged
    assertEquals('B', (char) byteString.byteAt(0));
    assertNotSame(byteArray, byteString.toByteArray());
}

It’s essential to note that both conversions involve a complete data copy, which can introduce overhead for large byte sequences.
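When only part of the data is needed, or when we just want to read it, two other ByteString methods may help reduce that overhead. The sketch below uses copyFrom() with an offset and length to copy only a slice of an array, and asReadOnlyByteBuffer() to read the contents through a ByteBuffer instead of calling toByteArray(). Whether the buffer shares memory with the ByteString depends on the implementation, so we treat that as a possible optimization rather than a guarantee. The helper names are illustrative:

```java
import com.google.protobuf.ByteString;
import java.nio.ByteBuffer;

public class PartialConversionExample {

    // Copies only 'length' bytes starting at 'offset' instead of the whole array
    static ByteString copySlice(byte[] source, int offset, int length) {
        return ByteString.copyFrom(source, offset, length);
    }

    // Exposes the contents for reading without handing out a mutable byte[]
    static ByteBuffer readOnlyView(ByteString byteString) {
        return byteString.asReadOnlyByteBuffer();
    }

    public static void main(String[] args) {
        byte[] source = { 1, 2, 3, 4, 5 };

        ByteString middle = copySlice(source, 1, 3);
        System.out.println(middle.size()); // prints 3

        ByteBuffer view = readOnlyView(middle);
        System.out.println(view.isReadOnly()); // prints true
        System.out.println(view.remaining());  // prints 3
        System.out.println(view.get());        // prints 2
    }
}
```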

7. Conclusion

In this article, we explored the fundamental differences between byte[] and ByteString, starting with the mutable nature of byte[] and its use in low-level stream operations, followed by the immutability of ByteString. We then compared their performance and APIs and, finally, saw how to convert between the two types.

Ultimately, the choice between them comes down to a simple principle: we use byte[] for mutable, general-purpose buffers, and we use ByteString as the default for all binary data in our Protobuf messages.

The code backing this article is available on GitHub.