Streaming Multipart Data Sequentially in Spring
Last updated: November 6, 2025
1. Overview
In modern web applications, efficiently transferring large files is crucial. Whether we’re sending multiple files to a client or receiving large uploads, we must minimize memory usage. However, Spring’s default buffered approach can bottleneck large payloads. It stores the entire file in memory or on disk before our code processes it. As a result, the application delays processing and consumes more resources.
Fortunately, Spring allows sequential streaming to avoid these limitations. This tutorial explains how to implement streaming for multipart data. Specifically, we discuss Spring MVC and Reactive WebFlux, with practical examples for uploads and downloads.
2. Default Multipart Handling in Spring
A MultipartResolver usually handles multipart requests in Spring MVC. It parses each incoming file and temporarily stores it in memory or on disk before passing it to the controller. Similarly, the default approach often loads the entire response into memory before sending it to a client.
While this method is straightforward and works for small files, it presents two major issues with larger uploads or downloads:
- High memory consumption: Large files can cause our applications to use excessive memory, which may result in slow performance or even an OutOfMemoryError.
- Delayed processing or delivery: The application must wait until all parts of the request are fully received before starting any processing or sending data, which postpones the first byte reaching the client.
These limitations make the default method unsuitable for large archives, massive datasets, or real-time uploads. A streaming approach solves the problem by processing or sending data as it arrives, without waiting for the full payload.
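To see the memory difference in isolation, here's a plain-JDK sketch (no Spring involved) that contrasts the buffered approach, where the whole payload lands on the heap at once, with a streaming copy that reuses a small fixed buffer. The file is a temporary stand-in for a large upload:

```java
import java.io.*;
import java.nio.file.*;

public class BufferedVsStreaming {
    public static void main(String[] args) throws IOException {
        Path source = Files.createTempFile("upload", ".bin");
        Files.write(source, new byte[1024 * 1024]); // 1 MB stands in for a large upload

        // Buffered approach: the entire payload is on the heap before any processing starts
        byte[] everything = Files.readAllBytes(source);
        System.out.println("Buffered bytes held at once: " + everything.length);

        // Streaming approach: a fixed 8 KB buffer is reused, so heap usage stays constant
        Path target = Files.createTempFile("copy", ".bin");
        long copied = 0;
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = in.read(chunk)) != -1) {
                out.write(chunk, 0, read); // each chunk can be processed as soon as it arrives
                copied += read;
            }
        }
        System.out.println("Streamed bytes via fixed buffer: " + copied);
    }
}
```

With streaming, memory usage is bounded by the chunk size rather than the payload size, which is exactly the property the Spring techniques below give us over HTTP.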
3. Streaming in Spring MVC
In Spring MVC applications, streaming enables us to send or receive large files incrementally, rather than buffering them entirely in memory or on disk. This approach keeps memory usage predictable, reduces latency, and enables real-time processing.
We’ll first examine streaming file uploads, then streaming file downloads, exploring both configuration and implementation techniques for each scenario.
3.1. Streaming File Uploads
In this approach, the application can process data immediately as it arrives, enabling early validation, transformation, or persistence. This ensures predictable memory usage even with multi-gigabyte uploads.
The first step is to configure the MultipartResolver to minimize buffering. Setting the file-size threshold to 0 in application.properties ensures that uploaded files are streamed directly from the request rather than being buffered in memory:
spring.servlet.multipart.max-file-size=10MB
spring.servlet.multipart.max-request-size=10MB
spring.servlet.multipart.file-size-threshold=0
Setting spring.servlet.multipart.file-size-threshold=0 disables in-memory buffering for all files. Any uploaded file, regardless of size, will be written directly to disk or processed as a stream instead of being held in memory. This setting is essential for predictable memory usage when handling large files, as it prevents sudden spikes in heap usage and allows the application to begin processing data immediately upon receipt.
With this configuration in place, controllers can receive uploaded files as instances of MultipartFile and process them incrementally:
@PostMapping("/upload")
public ResponseEntity<String> uploadFileStreaming(@RequestPart("filePart") MultipartFile filePart) throws IOException {
    Path targetPath = UPLOAD_DIR.resolve(filePart.getOriginalFilename());
    Files.createDirectories(targetPath.getParent());
    try (InputStream inputStream = filePart.getInputStream();
         OutputStream outputStream = Files.newOutputStream(targetPath)) {
        inputStream.transferTo(outputStream);
    }
    return ResponseEntity.ok("Upload successful: " + filePart.getOriginalFilename());
}
Because the file data is read as a stream from the MultipartFile, this approach avoids buffering the entire upload in memory. The transferTo() method efficiently copies the input stream to the output stream in a memory-conscious manner. This allows the controller to process large files incrementally, keeping memory usage predictable and making it straightforward to integrate streaming uploads into existing Spring MVC controllers.
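We can observe the chunked behavior of transferTo() directly with a small JDK-only sketch. The ChunkTrackingStream below is a hypothetical helper that records the largest chunk requested in a single read, showing that transferTo() never asks for the whole payload at once:

```java
import java.io.*;

public class TransferToDemo {
    // A source stream that serves a payload and records the largest chunk
    // requested in one read, to show that transferTo() copies in bounded pieces
    static class ChunkTrackingStream extends InputStream {
        private final byte[] data;
        private int pos = 0;
        int maxChunk = 0;

        ChunkTrackingStream(byte[] data) { this.data = data; }

        @Override public int read() {
            return pos < data.length ? data[pos++] & 0xFF : -1;
        }

        @Override public int read(byte[] b, int off, int len) {
            if (pos >= data.length) return -1;
            maxChunk = Math.max(maxChunk, len);
            int n = Math.min(len, data.length - pos);
            System.arraycopy(data, pos, b, off, n);
            pos += n;
            return n;
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[1024 * 1024]; // stands in for a large uploaded file
        ChunkTrackingStream in = new ChunkTrackingStream(payload);
        ByteArrayOutputStream out = new ByteArrayOutputStream(); // a file stream in the real controller

        long copied = in.transferTo(out); // the same call the controller uses

        System.out.println("Copied " + copied + " bytes");
        System.out.println("Largest single read: " + in.maxChunk + " bytes");
    }
}
```

The largest single read stays at the JDK's internal buffer size (a few KB, depending on the JDK version), regardless of how large the payload is.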
3.2. Streaming File Downloads
By default, Spring MVC typically buffers the response body before sending it, which wastes memory and delays delivery for large payloads. The StreamingResponseBody API avoids this by writing directly to the response output stream, so the first file can reach the client while subsequent files are still being processed.
For multiple files in a single HTTP response, we can use the multipart/mixed content type with a boundary string to separate each file in the stream:
@GetMapping("/download")
public StreamingResponseBody downloadFiles(HttpServletResponse response) throws IOException {
    String boundary = "filesBoundary";
    response.setContentType("multipart/mixed; boundary=" + boundary);
    List<Path> files = List.of(UPLOAD_DIR.resolve("file1.txt"), UPLOAD_DIR.resolve("file2.txt"));
    return outputStream -> {
        try (BufferedOutputStream bos = new BufferedOutputStream(outputStream);
             OutputStreamWriter writer = new OutputStreamWriter(bos)) {
            for (Path file : files) {
                writer.write("--" + boundary + "\r\n");
                writer.write("Content-Type: application/octet-stream\r\n");
                writer.write("Content-Disposition: attachment; filename=\"" + file.getFileName() + "\"\r\n\r\n");
                writer.flush();
                Files.copy(file, bos);
                bos.write("\r\n".getBytes());
                bos.flush();
            }
            writer.write("--" + boundary + "--\r\n");
            writer.flush();
        }
    };
}
In this example, each file is streamed directly from disk to the output stream. The explicit boundary markers allow the client to parse the stream into distinct files, and flushing after each write ensures that the data is pushed to the client without unnecessary delay. This method keeps memory use low and improves perceived performance, as users begin receiving data as soon as it’s available.
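To make the wire format concrete, this JDK-only sketch builds the same boundary-delimited body in memory and splits it back the way a client would. (A real client would use a proper multipart parser; splitting on the boundary string is only for illustration.)

```java
import java.util.ArrayList;
import java.util.List;

public class MultipartMixedDemo {
    public static void main(String[] args) {
        String boundary = "filesBoundary";

        // Build a multipart/mixed body with the same layout the controller writes
        StringBuilder body = new StringBuilder();
        for (String[] file : new String[][] { {"file1.txt", "hello"}, {"file2.txt", "world"} }) {
            body.append("--").append(boundary).append("\r\n")
                .append("Content-Type: application/octet-stream\r\n")
                .append("Content-Disposition: attachment; filename=\"").append(file[0]).append("\"\r\n\r\n")
                .append(file[1]).append("\r\n");
        }
        body.append("--").append(boundary).append("--\r\n");

        // A client recovers the individual parts by splitting on the boundary delimiter
        List<String> parts = new ArrayList<>();
        for (String segment : body.toString().split("--" + boundary)) {
            String trimmed = segment.strip();
            if (!trimmed.isEmpty() && !trimmed.equals("--")) { // skip the preamble and closing marker
                parts.add(trimmed);
            }
        }
        System.out.println("Recovered parts: " + parts.size());
    }
}
```

Each recovered part carries its own Content-Disposition header with the filename, which is how the client knows where one file ends and the next begins.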
4. Reactive Streaming With WebFlux
While Spring MVC efficiently streams files, Spring WebFlux provides superior scalability through non-blocking, backpressure-aware data handling. It streams files without blocking threads or excessive memory consumption. Although the core sequential streaming concepts remain, WebFlux implements them using reactive types like Flux and Mono instead of InputStream and OutputStream.
4.1. Streaming File Uploads
In WebFlux, we handle uploads by processing the multipart request as a reactive stream of Part objects. The key is to use the native FilePart interface, which provides the file content as a Flux<DataBuffer>. This allows us to process data chunks as they arrive over the network and write them to their destination using non-blocking I/O operations, maintaining the reactive chain from the network socket all the way to the disk:
@PostMapping(value = "/upload", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
@ResponseBody
public Mono<String> uploadFileStreaming(@RequestPart("filePart") FilePart filePart) {
    return Mono.fromCallable(() -> {
        Path targetPath = UPLOAD_DIR.resolve(filePart.filename());
        Files.createDirectories(targetPath.getParent());
        return targetPath;
    })
    // Files.createDirectories() is blocking, so run it off the event loop
    .subscribeOn(Schedulers.boundedElastic())
    .flatMap(targetPath -> filePart.transferTo(targetPath)
        .thenReturn("Upload successful: " + filePart.filename()));
}
This creates a non-blocking pipeline where FilePart.transferTo() internally handles the reactive streaming from request to the filesystem. The process is backpressure-aware, automatically regulating data flow to match disk speed and prevent server overload.
4.2. Streaming File Downloads
For downloads, WebFlux allows us to return the file content as a Flux<DataBuffer>, which Spring writes directly to the HTTP response socket. This approach streams the file to the client incrementally, without ever loading the entire content into memory. It’s the reactive equivalent of MVC’s StreamingResponseBody and is incredibly efficient for serving large assets:
@GetMapping(value = "/download", produces = "multipart/mixed")
public ResponseEntity<Flux<DataBuffer>> downloadFiles() {
    String boundary = "filesBoundary";
    DataBufferFactory bufferFactory = new DefaultDataBufferFactory();
    List<Path> files = List.of(
        UPLOAD_DIR.resolve("file1.txt"),
        UPLOAD_DIR.resolve("file2.txt")
    );

    // Use concatMap to ensure files are streamed one after another, sequentially
    Flux<DataBuffer> fileFlux = Flux.fromIterable(files)
      .concatMap(file -> {
          String partHeader = "--" + boundary + "\r\n"
            + "Content-Type: application/octet-stream\r\n"
            + "Content-Disposition: attachment; filename=\"" + file.getFileName() + "\"\r\n\r\n";
          Flux<DataBuffer> fileContentFlux = DataBufferUtils.read(file, bufferFactory, 4096);
          // Build the flux for this specific part: header + content + trailing CRLF
          return Flux.concat(
            Flux.just(bufferFactory.wrap(partHeader.getBytes())),
            fileContentFlux,
            Flux.just(bufferFactory.wrap("\r\n".getBytes()))
          );
      })
      // After all parts, append the closing boundary
      .concatWith(Flux.just(bufferFactory.wrap(("--" + boundary + "--\r\n").getBytes())));

    return ResponseEntity.ok()
      .header(HttpHeaders.CONTENT_TYPE, "multipart/mixed; boundary=" + boundary)
      .body(fileFlux);
}
Crucially, concatMap() ensures truly sequential streaming by processing one file’s entire Flux before starting the next, preserving the multipart order. This is combined with the efficiency of DataBufferUtils.read(), which streams file content in 4KB chunks using non-blocking I/O. The result is that the entire file is never loaded into memory, clients receive data immediately, and memory usage remains minimal.
5. Conclusion
Sequential streaming in Spring lets us handle large file transfers without draining memory or delaying processing. Whether we use StreamingResponseBody in MVC or FilePart and Flux&lt;DataBuffer&gt; in WebFlux, the key is to process data as it arrives.
For small files, the default buffered approach works fine. But when we deal with multi-GB datasets, large archives, or real-time uploads, streaming gives us lower latency, predictable memory use, and better scalability.
The code backing this article is available on GitHub.