Clone Only the Space In-Use from a Disk

1. Overview

A raw clone of a disk copies every block, used or not, so the image is as large as the disk itself and requires just as much free space on the target. To overcome this issue, we can store only the part of the disk that’s actually in use.

In this article, we’ll cover two different methods: the sparse file method and the compressed file method. Both of these approaches have their pros and cons, which we’ll discuss as we go through them.

2. First Steps

We’ll try to save as much space as we can. For instance, if we have a 1 TB disk and only 200 GB of it is in use, then a sparse image can save up to roughly 800 GB, because the file system doesn’t allocate blocks for the empty regions. We can also compress the image to shrink it further.
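To see what a sparse file looks like in practice, here’s a minimal sketch that doesn’t touch any real disk. The scratch directory and the sparse.img name are placeholders made up for this example, and the target file system is assumed to support sparse files:

```shell
# Scratch area for the experiment (hypothetical path created by mktemp)
workdir=$(mktemp -d)

# Create a 1 GiB file that consists of a single hole: no data blocks are written
truncate -s 1G "$workdir/sparse.img"

# Apparent size is what ls reports; allocated size is what the file really occupies
sparse_apparent_kib=$(du -k --apparent-size "$workdir/sparse.img" | cut -f1)
sparse_allocated_kib=$(du -k "$workdir/sparse.img" | cut -f1)
echo "apparent: ${sparse_apparent_kib} KiB, allocated: ${sparse_allocated_kib} KiB"

rm -r "$workdir"
```

The gap between the two numbers is the space a sparse image saves.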

However, if most of the disk is in use, the savings are small. In that case, we can clone the used space into multiple smaller files, although this takes a lot of time and generates high I/O.

One more thing to note is that our source drive should be healthy, and we must be able to write to it, since we’ll overwrite its empty space.

2.1. Disk Overview

Prior to cloning our drive, we should get a good grasp of how our disk is laid out:

$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda      8:0    0 119.2G  0 disk 
├─sda1   8:1    0   128M  0 part 
└─sda2   8:2    0 119.1G  0 part
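The layout alone doesn’t tell us how much data the image will have to hold. As a rough sketch, assuming the file system we’re interested in is mounted at / (an assumption for this example), we can ask df for the number of bytes in use:

```shell
# Bytes in use on the file system mounted at / (the mount point is an assumption)
used_bytes=$(df -B1 --output=used / | tail -n 1 | tr -d ' ')

# Convert to MiB for readability
echo "used on /: $((used_bytes / 1024 / 1024)) MiB"
```

This is only an estimate of the final image size, since the image also contains file system metadata and, with compression, can end up smaller than the used space.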

2.2. Zeroing Unused Blocks

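Before cloning our drive using the sparse file or the compression method, we should overwrite the empty space with zeroes. Free blocks usually still hold leftovers of deleted files, which neither dd nor a compressor can tell apart from live data. Once these blocks contain only zeroes, they can be skipped in a sparse image or compressed to almost nothing.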

Let’s mount our partitions first:

$ mount -o rw /dev/sda2 /

$ mount -o rw /dev/sda1 /boot/efi

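Now, we can use dd to zero-fill the empty space of our partitions. Each command keeps writing until the partition is full and then stops with a “No space left on device” error, which is expected here: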

$ dd if=/dev/zero of=/zero_file bs=32M

$ dd if=/dev/zero of=/boot/efi/zero_file bs=32M

Next, we’ll sync the storage:

$ sync

2.3. Clean Up

After we’re done, we’ll remove the zero_file files and unmount the partitions:

$ rm /zero_file /boot/efi/zero_file

$ umount /dev/sda1 /dev/sda2

3. Sparse File Method

The sparse file method can be used only if the target file system supports sparse files. Native Linux file systems such as ext4, XFS, and Btrfs do, whereas FAT32 and exFAT don’t.

We can easily create a sparse image file using dd by setting its conv option to sparse:

$ dd if=/dev/sda of=/mnt/external/sda.img bs=512 conv=sparse

The resulting image has the same apparent size as the source disk, but blocks that contain only zeroes are stored as holes and take up no space on the target file system. Therefore, the image occupies roughly as much space as the data on the disk. Depending on the size of the disk, it might take a long time to complete, because dd still has to read every block.

Once it’s done, we can sync the storage:

$ sync
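We can try the same dd invocation safely on a small stand-in file before running it against a real disk. In this sketch, disk.raw plays the role of /dev/sda and disk.img the role of the image. Both names and the scratch directory are made up for the example:

```shell
workdir=$(mktemp -d)

# Stand-in disk: 8 MiB of written zeroes with 1 MiB of random "data" at the start
dd if=/dev/zero of="$workdir/disk.raw" bs=1M count=8 status=none
dd if=/dev/urandom of="$workdir/disk.raw" bs=1M count=1 conv=notrunc status=none

# Same options as for the real disk: all-zero 512-byte blocks become holes
dd if="$workdir/disk.raw" of="$workdir/disk.img" bs=512 conv=sparse status=none

# Compare the apparent size with the space actually allocated for the image
img_apparent_kib=$(du -k --apparent-size "$workdir/disk.img" | cut -f1)
img_allocated_kib=$(du -k "$workdir/disk.img" | cut -f1)

# The holes read back as zeroes, so the content must match byte for byte
if cmp -s "$workdir/disk.raw" "$workdir/disk.img"; then
    img_content="identical"
else
    img_content="different"
fi
echo "apparent: ${img_apparent_kib} KiB, allocated: ${img_allocated_kib} KiB, content: ${img_content}"

rm -r "$workdir"
```

On a real image, the same du comparison tells us whether the zero-fill step paid off.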

3.1. Restoring the Image

Similarly, we can restore our image using dd:

$ dd if=/mnt/external/sda.img of=/dev/sda bs=512

3.2. Pros and Cons

One of the pros of using this method is that the image is a plain raw copy of the disk, so we can mount it for use. Not only that, but we can also map the partitions inside the image and access their files using tools like kpartx.

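On the other hand, a sparse file stays sparse only as long as the tools that handle it are sparse-aware. Otherwise, the holes are written out as real zeroes and the file grows to its full size. With cp, we can request sparse output explicitly: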

$ cp --sparse=always sda.img /path
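The effect of the option is easy to observe on a throwaway file. The following sketch uses a made-up 64 MiB stand-in for sda.img inside a scratch directory and copies it with both settings of --sparse:

```shell
workdir=$(mktemp -d)

# Hypothetical stand-in for sda.img: 64 MiB consisting of a single hole
truncate -s 64M "$workdir/sda.img"

# never: holes are written out as zeroes; always: runs of zeroes become holes
cp --sparse=never  "$workdir/sda.img" "$workdir/expanded.img"
cp --sparse=always "$workdir/sda.img" "$workdir/compact.img"

expanded_kib=$(du -k "$workdir/expanded.img" | cut -f1)
compact_kib=$(du -k "$workdir/compact.img" | cut -f1)
echo "sparse=never: ${expanded_kib} KiB, sparse=always: ${compact_kib} KiB"

rm -r "$workdir"
```

Other tools have comparable switches. For instance, both rsync and tar accept -S (--sparse).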

4. Compressed File Method

We can also use dd to clone the used space and compress it using tools like gzip:

$ dd if=/dev/sda bs=32M status=progress | gzip -c > /mnt/external/sda.img.gz

Let’s break it down:

we provided /dev/sda as an input to the dd command
we set the bs option to 32M to ensure a 32 MiB buffer
status=progress enables the progress monitor
we piped the output of the dd command to gzip, which would then write the contents to our image file

Mind that we have to run the commands as root.

Moreover, for gzip, we can use the --fast or -1 option for faster compression. Similarly, for better compression, we can use the --best or -9 option. Additionally, if we want to speed things up, we can also take a look at pigz, a parallel implementation of gzip that uses multiple CPU cores.
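The pipeline can be rehearsed without root on a small stand-in file. In this sketch, disk.raw is a made-up substitute for /dev/sda that holds 1 MiB of random data followed by 15 MiB of zeroes, which mimics a zero-filled disk:

```shell
workdir=$(mktemp -d)

# Stand-in disk: 1 MiB of random data, then extended with zeroes to 16 MiB
dd if=/dev/urandom of="$workdir/disk.raw" bs=1M count=1 status=none
truncate -s 16M "$workdir/disk.raw"

# Same pipeline as for the real disk, minus the progress output
dd if="$workdir/disk.raw" bs=32M status=none | gzip -c > "$workdir/disk.img.gz"

# Check the archive's integrity without unpacking it; remember the outcome
if gzip -t "$workdir/disk.img.gz"; then
    gz_check="ok"
else
    gz_check="corrupt"
fi

raw_bytes=$(stat -c %s "$workdir/disk.raw")
gz_bytes=$(stat -c %s "$workdir/disk.img.gz")
echo "raw: ${raw_bytes} bytes, compressed: ${gz_bytes} bytes, archive: ${gz_check}"

rm -r "$workdir"
```

When scripting the real command in bash, it’s worth considering set -o pipefail, since otherwise the exit status of the pipeline reflects only gzip and a failed read by dd can go unnoticed.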

4.1. Restoring the Image

Similarly, we can write the image back to our disk:

$ gzip -cd < /mnt/external/sda.img.gz | dd of=/dev/sda bs=32M

-cd options will decompress the image to the standard output
the contents of the image are piped to the dd command as an input
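Since a restore overwrites the whole target, it can be reassuring to rehearse the round trip first. The sketch below uses made-up stand-in files, where disk.raw plays the source disk and restored.raw plays the target, and checks that the restored copy matches the original:

```shell
workdir=$(mktemp -d)

# Stand-in source disk with 2 MiB of random content
dd if=/dev/urandom of="$workdir/disk.raw" bs=1M count=2 status=none

# Create the compressed image, then restore it the same way as onto /dev/sda
gzip -c < "$workdir/disk.raw" > "$workdir/disk.img.gz"
gzip -cd < "$workdir/disk.img.gz" | dd of="$workdir/restored.raw" bs=32M status=none

# A byte-for-byte comparison confirms the restore
if cmp -s "$workdir/disk.raw" "$workdir/restored.raw"; then
    restore_status="identical"
else
    restore_status="different"
fi
echo "restore check: ${restore_status}"

rm -r "$workdir"
```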

4.2. Pros and Cons

The resulting compressed file will be much smaller if the contents of our source disk compress well. Apart from that, the file requires no special treatment when we copy or move it, as opposed to a sparse file.

On the other hand, we can’t mount the image or access the files inside it without decompressing it in full first.

5. Conclusion

In this article, we discussed how we could clone only the space in use on our disk. First, we saw how we could zero-fill our partitions before making a clone of our disk. Afterward, we went through the sparse file method to create a sparse image. Finally, we covered the compressed file method through the use of gzip.
