
1. Introduction

In this tutorial, we’ll explore different Java libraries that we can use to extract tar archives. The tar format originated with the Unix tar utility as a way to package files together without compression, but today it’s very common to compress tar archives with gzip. So, we’ll also see how compressed vs. uncompressed tar archives affect our code.

2. Creating a Base Class for Implementations

To avoid boilerplate, let’s start with an abstract class we’ll use as the basis for our implementations. This class will define a single abstract method, untar(), which will perform the extraction:

public abstract class TarExtractor {

    private InputStream tarStream;
    private boolean gzip;
    private Path destination;

    // ...

    public abstract void untar() throws IOException;
}

Now, let’s define a couple of constructors for our base class. The primary constructor will receive the tar archive contents as an InputStream, a boolean indicating whether those contents are gzip-compressed, and a Path to where the files will be extracted:

protected TarExtractor(InputStream in, boolean gzip, Path destination) throws IOException {
    this.tarStream = in;
    this.gzip = gzip;
    this.destination = destination;

    Files.createDirectories(destination);
}

Most importantly, we create the base directory structure for the extracted files with Files.createDirectories(), so we don’t need to create the destination folder ourselves. For the sake of simplicity, we’re using a boolean to indicate whether our archive is gzip-compressed, which saves us from writing code to detect the actual file type from its contents.
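If we ever did need content-based detection, gzip data always starts with the magic bytes 0x1f 0x8b, which we can peek at without consuming the stream. Here’s a minimal sketch of that idea; the GzipSniffer class and its method are our own illustrative names, not part of the article’s code:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

public class GzipSniffer {

    // Peeks at the first two bytes without consuming them;
    // gzip data always starts with the magic bytes 0x1f 0x8b
    static boolean isGzip(BufferedInputStream in) {
        in.mark(2);
        try {
            int b1 = in.read();
            int b2 = in.read();
            in.reset();
            return b1 == 0x1f && b2 == 0x8b;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small gzip-compressed payload in memory for the demo
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write("hello".getBytes());
        }
        BufferedInputStream gzipped = new BufferedInputStream(new ByteArrayInputStream(bos.toByteArray()));
        BufferedInputStream plain = new BufferedInputStream(new ByteArrayInputStream("hello".getBytes()));

        System.out.println(isGzip(gzipped)); // true
        System.out.println(isGzip(plain));   // false
    }
}
```

Because the stream is only marked and reset, it can still be handed to a GZIPInputStream afterward.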

Then, in our second constructor, we’ll accept a Path to a tar archive and determine if it’s compressed based on the file name. Note that this relies on the file name being correct:

protected TarExtractor(Path tarFile, Path destination) throws IOException {
    this(Files.newInputStream(tarFile), tarFile.getFileName().toString().endsWith(".gz"), destination);
}

Finally, to simplify tests, we’ll create an interface with a static method that returns a tar archive from our resources folder:

public interface Resources {
    
    static InputStream tarGzFile() {
        return Resources.class.getResourceAsStream("/untar/test.tar.gz");
    }
}

This can be any tar archive compressed with gzip. We wrap it in a method so that every call returns a fresh InputStream, which avoids “stream closed” errors when multiple tests consume the archive.

3. Extraction Using Apache Commons Compression

In our first implementation, we’ll use the Apache Commons library commons-compress:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-compress</artifactId>
    <version>1.23.0</version>
</dependency>

The solution involves instantiating a TarArchiveInputStream, which will receive our archive stream. Then, we need to wrap it inside a GzipCompressorInputStream if using gzip:

public class TarExtractorCommonsCompress extends TarExtractor {

    protected TarExtractorCommonsCompress(InputStream in, boolean gzip, Path destination) throws IOException {
        super(in, gzip, destination);
    }

    public void untar() throws IOException {
        try (BufferedInputStream inputStream = new BufferedInputStream(getTarStream());
          TarArchiveInputStream tar = new TarArchiveInputStream(
          isGzip() ? new GzipCompressorInputStream(inputStream) : inputStream)) {
            ArchiveEntry entry;
            while ((entry = tar.getNextEntry()) != null) {
                Path extractTo = getDestination().resolve(entry.getName());
                if (entry.isDirectory()) {
                    Files.createDirectories(extractTo);
                } else {
                    Files.copy(tar, extractTo);
                }
            }
        }
    }
}

First, we iterate over our TarArchiveInputStream, looping until getNextEntry() returns null. Then, if the current entry is a directory, we create it relative to our destination folder, so we don’t get an error when later writing a file inside it. Otherwise, we use Files.copy() to write the entry’s contents from our tar stream to the target path.
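One caveat worth noting: entry names inside an archive can contain “..” segments, so blindly resolving them against the destination can write files outside it (the so-called “zip slip” problem). A minimal guard we could apply before each copy might look like this; SafePaths and safeResolve are our own illustrative names:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class SafePaths {

    // Resolves an archive entry name under the destination folder and
    // rejects names that would escape it, e.g. via ".." segments
    static Path safeResolve(Path destination, String entryName) {
        Path extractTo = destination.resolve(entryName).normalize();
        if (!extractTo.startsWith(destination)) {
            throw new IllegalArgumentException("Entry outside destination: " + entryName);
        }
        return extractTo;
    }

    public static void main(String[] args) {
        System.out.println(safeResolve(Paths.get("/tmp/out"), "dir/file.txt")); // /tmp/out/dir/file.txt

        try {
            safeResolve(Paths.get("/tmp/out"), "../evil.txt");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The same check works for all three implementations in this article, since each one resolves entry names against the destination.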

Let’s test it by extracting the archive contents into an arbitrary folder:

@Test
public void givenTarGzFile_whenUntar_thenExtractedToDestination() throws IOException {
    Path destination = Paths.get("/tmp/commons-compress-gz");

    new TarExtractorCommonsCompress(Resources.tarGzFile(), true, destination).untar();

    try (Stream<Path> files = Files.list(destination)) {
        assertTrue(files.findFirst().isPresent());
    }
}

If our archive weren’t using gzip, we’d only need to pass false when instantiating our TarExtractorCommonsCompress object. Also, note that commons-compress ships compressor streams for formats other than gzip, such as BZip2CompressorInputStream, which we could swap in using the same pattern.

4. Extraction Using Apache Ant

With Apache Ant, we can get close to a core Java implementation, as we can use GZIPInputStream from java.util.zip in case our archive is using gzip:

<dependency>
    <groupId>org.apache.ant</groupId>
    <artifactId>ant</artifactId>
    <version>1.10.13</version>
</dependency>

We’ll have a very similar implementation:

public class TarExtractorAnt extends TarExtractor {

    // standard delegate constructor

    public void untar() throws IOException {
        try (TarInputStream tar = new TarInputStream(new BufferedInputStream(
          isGzip() ? new GZIPInputStream(getTarStream()) : getTarStream()))) {
            TarEntry entry;
            while ((entry = tar.getNextEntry()) != null) {
                Path extractTo = getDestination().resolve(entry.getName());
                if (entry.isDirectory()) {
                    Files.createDirectories(extractTo);
                } else {
                    Files.copy(tar, extractTo);
                }
            }
        }
    }
}

The logic is the same here, but we use TarInputStream and TarEntry from Apache Ant instead of TarArchiveInputStream and ArchiveEntry. We can test it the same way as the previous solution:

@Test
public void givenTarGzFile_whenUntar_thenExtractedToDestination() throws IOException {
    Path destination = Paths.get("/tmp/ant-gz");

    new TarExtractorAnt(Resources.tarGzFile(), true, destination).untar();

    try (Stream<Path> files = Files.list(destination)) {
        assertTrue(files.findFirst().isPresent());
    }
}

5. Extraction Using Apache VFS

In our last example, we’ll use Apache commons-vfs2, which supports different file system schemes with a single API, one of them being tar archives:

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-vfs2</artifactId>
    <version>2.9.0</version>
</dependency>

But, since we’re reading from an input stream, we’ll first need to save our stream to a temp file so we can generate a URI afterward:

public class TarExtractorVfs extends TarExtractor {

    // standard delegate constructor

    public void untar() throws IOException {
        Path tmpTar = Files.createTempFile("temp", isGzip() ? ".tar.gz" : ".tar");
        Files.copy(getTarStream(), tmpTar);

        // ...

        Files.delete(tmpTar);
    }
}

We’ll delete our temp file at the end of our extraction. Next, we’ll get an instance of a FileSystemManager and resolve our file URI into a FileObject, which we’ll then use to iterate over our archive contents:

FileSystemManager fsManager = VFS.getManager();
String uri = String.format("%s:file://%s", isGzip() ? "tgz" : "tar", tmpTar);
FileObject tar = fsManager.resolveFile(uri);
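Before moving on, let’s make the URI shape concrete. Assuming a hypothetical temp file at /tmp/temp123.tar.gz (the path is ours, for illustration, on a Unix-like file system), the String.format() call above yields:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class VfsUriDemo {

    // Builds the commons-vfs2 URI for a local tar or tar.gz file,
    // mirroring the String.format() call from the extractor
    static String tarUri(Path tarFile, boolean gzip) {
        return String.format("%s:file://%s", gzip ? "tgz" : "tar", tarFile);
    }

    public static void main(String[] args) {
        System.out.println(tarUri(Paths.get("/tmp/temp123.tar.gz"), true));  // tgz:file:///tmp/temp123.tar.gz
        System.out.println(tarUri(Paths.get("/tmp/archive.tar"), false));    // tar:file:///tmp/archive.tar
    }
}
```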

Note that, for resolveFile(), we construct our URI differently if we’re using gzip, prefixing it with “tgz” (which means tar+gzip) instead of “tar”. Finally, we iterate over our archive contents, extracting each file:

for (FileObject entry : tar) {
    Path extractTo = Paths.get(
      getDestination().toString(), entry.getName().getPath());

    if (entry.isReadable() && entry.getType() == FileType.FILE) {
        Files.createDirectories(extractTo.getParent());

        try (FileContent content = entry.getContent(); 
          InputStream stream = content.getInputStream()) {
            Files.copy(stream, extractTo);
        }
    }
}

Because we might receive entries out of order, we check that the entry is a file and call createDirectories() on its parent, so we never try to create a file before its directory exists. Also, since VFS returns entry paths with a leading slash, we can’t use Path.resolve() to build the destination paths like in the previous implementations; instead, we join the strings with Paths.get(). Let’s test it:

@Test
public void givenTarGzFile_whenUntar_thenExtractedToDestination() throws IOException {
    Path destination = Paths.get("/tmp/vfs-gz");

    new TarExtractorVfs(Resources.tarGzFile(), true, destination).untar();

    try (Stream<Path> files = Files.list(destination)) {
        assertTrue(files.findFirst().isPresent());
    }
}
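As a side note, the reason we joined strings with Paths.get() instead of calling resolve() is that resolve() discards the base path entirely when given an absolute argument, which is exactly what a VFS entry name with a leading slash is. The paths below are illustrative:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

public class ResolveDemo {

    public static void main(String[] args) {
        Path destination = Paths.get("/tmp/vfs-gz");

        // resolve() discards the base path when the argument is absolute:
        System.out.println(destination.resolve("/a/b.txt"));               // /a/b.txt

        // Joining as strings keeps the destination prefix:
        System.out.println(Paths.get(destination.toString(), "/a/b.txt")); // /tmp/vfs-gz/a/b.txt
    }
}
```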

This solution is only helpful if we already use VFS in our project, as it requires a little more code.

6. Conclusion

In this article, we learned how to extract tar archives using different libraries. Our implementations extended from a base class, reducing our code and making them simpler to use.

The code backing this article is available on GitHub.