Stream Large Byte[] to File With WebClient

Last updated: January 8, 2024

Written by: Ulisses Lima

Reviewed by: Luis Javier Peris

1. Introduction

In this quick tutorial, we’ll stream a large file from a server with WebClient. To illustrate, we’ll create a simple controller and two clients. Ultimately, we’ll learn how and when to use Spring’s DataBuffer and DataBufferUtils.

2. Our Scenario With a Simple Server

We’ll start with a simple controller for downloading an arbitrary file. First, we’ll construct a FileSystemResource from a file Path, then set it as the body of our ResponseEntity:

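import java.nio.file.Paths;

import org.springframework.core.io.FileSystemResource;
import org.springframework.core.io.Resource;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequestMapping("/large-file")
public class LargeFileController {

    @GetMapping
    ResponseEntity<Resource> get() {
        return ResponseEntity.ok()
          .body(new FileSystemResource(Paths.get("/tmp/large.dat")));
    }
}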

Secondly, we need to generate the file we’re referencing. Since the contents aren’t critical for understanding the tutorial, we’ll use fallocate to reserve a specified size on the disk without writing anything. So, let’s create our large file by running this command:

fallocate -l 128M /tmp/large.dat
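
On systems where fallocate isn’t available, a file of the same size can be produced from Java. The following is only a minimal sketch using the JDK’s RandomAccessFile; the class name LargeFileGenerator and its helper method are hypothetical and not part of the article’s project. Unlike fallocate, setLength() doesn’t promise to reserve disk blocks, so whether the space is physically allocated depends on the platform and file system:

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the article's project
public class LargeFileGenerator {

    // 128 MiB, the same size requested with "fallocate -l 128M"
    static final long SIZE_IN_BYTES = 128L * 1024 * 1024;

    // Extends (or truncates) the file to the given length without writing any content
    static long createFile(Path path, long sizeInBytes) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path.toFile(), "rw")) {
            file.setLength(sizeInBytes);
        }
        return Files.size(path);
    }

    public static void main(String... args) throws IOException {
        // Falls back to the path used by the controller when no argument is given
        Path path = Paths.get(args.length > 0 ? args[0] : "/tmp/large.dat");
        long size = createFile(path, SIZE_IN_BYTES);
        System.out.printf("created %d bytes at %s%n", size, path);
    }
}
```

Running main() without arguments should leave a 134217728-byte file at /tmp/large.dat, which matches the size our clients report after downloading.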

Finally, we have a file that clients can download. So, we’re ready to start writing our clients.

3. WebClient With ExchangeStrategies for Large Files

We’ll start with a simple but limited WebClient to download our file. We’ll use ExchangeStrategies to raise the limit on how many bytes the client’s codecs may buffer in memory, which is 256 KB by default. This way, we can handle a larger number of bytes, but we’re still limited to the maximum memory available to the JVM. Let’s use bodyToMono() to get a Mono<byte[]> from the server:

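import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.springframework.web.reactive.function.client.WebClient;

import reactor.core.publisher.Mono;

public class LimitedFileDownloadWebClient {

    public static long fetch(WebClient client, String destination) throws IOException {
        // buffers the entire response body in memory
        Mono<byte[]> mono = client.get()
          .retrieve()
          .bodyToMono(byte[].class);

        byte[] bytes = mono.block();

        Path path = Paths.get(destination);
        Files.write(path, bytes);
        return bytes.length;
    }

    // ...
}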

In other words, we’re retrieving the entire response contents into a byte[]. Afterward, we write those bytes to our path and return the number of bytes downloaded. Let’s create a main() method to test it:

public static void main(String... args) throws IOException {
    String baseUrl = args[0];
    String destination = args[1];

    WebClient client = WebClient.builder()
      .baseUrl(baseUrl)
      .exchangeStrategies(useMaxMemory())
      .build();

    long bytes = fetch(client, destination);
    System.out.printf("downloaded %d bytes", bytes);
}

Our main() method takes two arguments: the download URL and a local destination for the file. To avoid a DataBufferLimitException in our client, let’s configure an exchange strategy that raises the number of bytes loadable into memory. Instead of defining a fixed size, we’ll use the maximum memory available to the JVM, as reported by Runtime. Note that this is not recommended and is just for demonstration purposes:

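private static ExchangeStrategies useMaxMemory() {
    long totalMemory = Runtime.getRuntime().maxMemory();
    // maxInMemorySize() takes an int, so clamp the value instead of letting the cast overflow
    int maxInMemorySize = (int) Math.min(totalMemory, Integer.MAX_VALUE);

    return ExchangeStrategies.builder()
      .codecs(configurer -> configurer.defaultCodecs()
        .maxInMemorySize(maxInMemorySize)
      )
      .build();
}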

To clarify, ExchangeStrategies controls how our client reads and writes request and response bodies. In this case, we’re using the codecs() method from the builder to adjust the default codecs, so we don’t replace any of the other default settings.
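
One detail deserves care here: maxInMemorySize() accepts an int, while Runtime.maxMemory() returns a long. The following minimal sketch, with a hypothetical class and helper name, shows what a plain narrowing cast does once the heap exceeds the int range, and how clamping avoids it. It uses only the JDK and makes no claim about how Spring treats out-of-range values:

```java
// Hypothetical demo class, not part of the article's project
public class MaxInMemorySizeDemo {

    // Clamps a byte count to the int range expected by maxInMemorySize(int)
    static int clampToInt(long bytes) {
        return (int) Math.min(bytes, Integer.MAX_VALUE);
    }

    public static void main(String... args) {
        long threeGiB = 3L * 1024 * 1024 * 1024;

        // A narrowing cast keeps only the low 32 bits, so the result turns negative
        System.out.println((int) threeGiB);       // -1073741824

        // Clamping keeps the largest value an int can represent
        System.out.println(clampToInt(threeGiB)); // 2147483647

        // The value our exchange strategy would start from on this JVM
        System.out.println(clampToInt(Runtime.getRuntime().maxMemory()));
    }
}
```

With the -Xmx256m setting used in this tutorial, both the cast and the clamp give the same result, so the difference only matters for heaps of 2 GiB and above.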

3.1. Running Our Client With Memory Adjustments

Next, we’ll package our project as a jar at /tmp/app.jar and run our server on localhost:8081. Then, let’s define some variables and run our client from the command line:

limitedClient='com.baeldung.streamlargefile.client.LimitedFileDownloadWebClient' 
endpoint='http://localhost:8081/large-file' 
java -Xmx256m -cp /tmp/app.jar $limitedClient $endpoint /tmp/download.dat

Notice that we’re giving our application a maximum heap twice the size of our 128M file. With that, the download succeeds, and we get the following output:

downloaded 134217728 bytes

On the other hand, if we don’t allocate enough memory, we’ll get an OutOfMemoryError:

$ java -Xmx64m -cp /tmp/app.jar $limitedClient $endpoint /tmp/download.dat
reactor.netty.ReactorNetty$InternalNettyException: java.lang.OutOfMemoryError: Direct buffer memory

This approach doesn’t rely on Spring Core utilities. However, it’s limited because the whole response must fit in memory at once, so we can’t download a file whose size is close to, or larger than, the max memory for our application.

4. WebClient for Any File Size With DataBuffer

A safer approach is to use DataBuffer and DataBufferUtils to stream our download in chunks so that the whole file doesn’t get loaded into memory. This time, we’ll use bodyToFlux() to retrieve a Flux<DataBuffer>, write it to our path, and return its size in bytes:

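import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.core.io.buffer.DataBufferUtils;
import org.springframework.web.reactive.function.client.WebClient;

import reactor.core.publisher.Flux;

public class LargeFileDownloadWebClient {

    public static long fetch(WebClient client, String destination) throws IOException {
        // the body arrives as a stream of buffers instead of one large array
        Flux<DataBuffer> flux = client.get()
          .retrieve()
          .bodyToFlux(DataBuffer.class);

        Path path = Paths.get(destination);
        // writes each buffer to the file as it arrives and waits for completion
        DataBufferUtils.write(flux, path)
          .block();

        return Files.size(path);
    }

    // ...
}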

Finally, let’s write the main method to receive our arguments, create a WebClient, and fetch our file:

public static void main(String... args) throws IOException {
    String baseUrl = args[0];
    String destination = args[1];

    WebClient client = WebClient.create(baseUrl);

    long bytes = fetch(client, destination);
    System.out.printf("downloaded %d bytes", bytes);
}

And that’s it. This approach is more versatile, as we don’t depend on file or memory size. Let’s set the max memory to a fourth of the size of our file and run it using the same endpoint from earlier:

client='com.baeldung.streamlargefile.client.LargeFileDownloadWebClient'
java -Xmx32m -cp /tmp/app.jar $client $endpoint /tmp/download.dat

In the end, we’ll get a successful output, even though our application had less total memory than the size of our file:

downloaded 134217728 bytes
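
To build intuition for why chunked writing keeps memory usage flat, here’s a plain-JDK analogy. It’s only a sketch under stated assumptions: the class ChunkedCopy and its copy() method are hypothetical, it uses blocking streams rather than reactive types, and it isn’t how DataBufferUtils is implemented. The idea it shows is the same, though: one small, reusable buffer carries any number of bytes to disk:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the article's project
public class ChunkedCopy {

    // Copies the stream through a single reusable buffer, so the memory held
    // by the copy is bufferSize bytes regardless of the total size
    static long copy(InputStream source, Path destination, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long total = 0;
        try (OutputStream out = Files.newOutputStream(destination)) {
            int read;
            while ((read = source.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    public static void main(String... args) throws IOException {
        // Stand-in for a response body; a real download would read from the network
        byte[] payload = new byte[1024 * 1024];
        Path destination = Files.createTempFile("chunked", ".dat");

        long bytes = copy(new ByteArrayInputStream(payload), destination, 8 * 1024);
        System.out.printf("copied %d bytes%n", bytes); // copied 1048576 bytes
    }
}
```

In the sketch, the payload itself lives in memory only because it’s a stand-in. With a network stream as the source, the 8 KB buffer would be the only data held by the copy at any moment.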

5. Conclusion

In this article, we learned two ways to use WebClient to download a large file. First, we raised the amount of memory available for our WebClient operations and read the whole response into a byte[]. Then, we saw the drawbacks of this approach, since the file has to fit in memory. Most importantly, we learned how to make our client use memory efficiently by streaming the response to disk with DataBuffer and DataBufferUtils.

The code backing this article is available on GitHub.