Handling big files has always been challenging. We need to cope with the limited capacity of storage media as well as the size limits that web services impose on transfers.
In this tutorial, we’ll learn how to split files, create archives right away in parts, and then restore the original content.
2. Splitting Binary File
We can divide a file with split. So as an example, let’s split the ubuntu-22.04-desktop-amd64.iso file:
$ split -d -b 500M ubuntu-22.04-desktop-amd64.iso ubuntu_parts/ubuntu_piece_
Let’s notice that we use the -b option to set the size of each part to 500 MiB (with split, the M suffix means 1024 × 1024 bytes) and the -d option to create numeric suffixes. Furthermore, we choose the folder for the results by simply including it in the prefix ubuntu_parts/ubuntu_piece_. Of course, this folder must already exist, as split doesn’t create it. Now, let’s check the result:
$ ls -sh1 ubuntu_parts
total 3.5G
500M ubuntu_piece_00
500M ubuntu_piece_01
501M ubuntu_piece_02
500M ubuntu_piece_03
500M ubuntu_piece_04
501M ubuntu_piece_05
486M ubuntu_piece_06
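The same steps can be tried on a small scale before committing several gigabytes of disk space. The sketch below is only an illustration: the temporary directory, the sample.bin file, and the 1000-byte piece size are assumptions made for the demo and aren’t part of the Ubuntu example. It also computes the number of pieces to expect with a ceiling division.

```shell
# Throwaway working directory; all names below are made up for the demo
workdir=$(mktemp -d)
mkdir -p "$workdir/parts"      # split does not create the target folder

# Build a 2500-byte sample file from random data
size=2500
chunk=1000
head -c "$size" /dev/urandom > "$workdir/sample.bin"

# Ceiling division gives the number of pieces to expect
expected=$(( (size + chunk - 1) / chunk ))

# Cut the file into pieces of at most $chunk bytes with numeric suffixes
split -d -b "$chunk" "$workdir/sample.bin" "$workdir/parts/piece_"

# Count the pieces and show the size of each one in bytes
actual=$(ls "$workdir/parts" | wc -l)
echo "expected $expected pieces, found $actual"
wc -c "$workdir"/parts/piece_*
```

Here, the first two pieces are full 1000-byte chunks, and the last one, piece_02, holds the remaining 500 bytes.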
Then, let’s combine the pieces back into the iso file:
$ cat ubuntu_parts/* > ubuntu_again.iso
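Let’s emphasize that the order of the parts is preserved because the shell expands the * wildcard into an alphabetically sorted list of file names, and cat concatenates its arguments in exactly the order given. Moreover, cat works with both text and binary files.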
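Because the order comes from how the shell sorts names, it matters that the suffixes have a fixed width. As a small sketch, with made-up one-byte files standing in for real pieces, we can compare names that aren’t zero-padded with names that are:

```shell
workdir=$(mktemp -d)

# Three made-up pieces whose numbers are NOT zero-padded
printf 'A' > "$workdir/part_1"
printf 'B' > "$workdir/part_2"
printf 'C' > "$workdir/part_10"

# The glob sorts names as text, so part_10 lands before part_2
unpadded=$(cat "$workdir"/part_*)
echo "unpadded: $unpadded"

# Zero-padded names sort in the intended order
printf 'A' > "$workdir/piece_01"
printf 'B' > "$workdir/piece_02"
printf 'C' > "$workdir/piece_10"
padded=$(cat "$workdir"/piece_*)
echo "padded:   $padded"
```

The first cat yields ACB, while the second one yields ABC. The fixed-width suffixes that split generates (00, 01, and so on) avoid this pitfall.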
Afterward, we can verify the MD5 checksum with md5sum to make sure that the file is correctly restored:
$ md5sum ubuntu_again.iso
7621da10af45a031ea9a0d1d7fea9643 ubuntu_again.iso
$ md5sum ubuntu-22.04-desktop-amd64.iso
7621da10af45a031ea9a0d1d7fea9643 ubuntu-22.04-desktop-amd64.iso
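Instead of comparing the two hashes by eye, we can let the shell do it. The following is a minimal sketch of the whole round trip on a small random file. The file names and sizes are assumptions chosen for the demo:

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/parts"

# Made-up 5000-byte source file, split into 2048-byte pieces and rejoined
head -c 5000 /dev/urandom > "$workdir/original.bin"
split -d -b 2048 "$workdir/original.bin" "$workdir/parts/piece_"
cat "$workdir"/parts/piece_* > "$workdir/restored.bin"

# Reading from stdin keeps the file name out of the md5sum output,
# so the two strings can be compared directly
sum_original=$(md5sum < "$workdir/original.bin")
sum_restored=$(md5sum < "$workdir/restored.bin")

if [ "$sum_original" = "$sum_restored" ]; then
    result="match"
else
    result="differ"
fi
echo "checksums $result"
```

When both files are still on the same machine, cmp -s original.bin restored.bin is an alternative that compares them byte by byte and reports the outcome through its exit status.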
3. Using zip
Instead of splitting an existing file, we can create a split archive right away. So, let’s use zip, which has supported this kind of archive since version 3.0. We pass the maximum size of each created file with the -s option. As units, we can use k, m, g, and t for kilobytes, megabytes, gigabytes, and terabytes, respectively. The default unit is megabytes.
So, let’s prepare a backup of the ~/Pictures folder in 180-megabyte parts:
$ zip -r -s 180 Pictures_backup.zip ~/Pictures
Now, let’s examine the resulting files:
$ ls -sh1 Pictures_backup.*
180M Pictures_backup.z01
180M Pictures_backup.z02
180M Pictures_backup.z03
180M Pictures_backup.z04
180M Pictures_backup.z05
7.4M Pictures_backup.zip
So, let’s notice that zip numbers the parts on its own, using extensions like z01, z02, and so on, while the last part keeps the zip extension. Moreover, we shouldn’t rename these files, as the tools look for the parts under exactly these names when the archive is read back.
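To see the naming scheme without a large folder at hand, we can build a tiny split archive. This is a sketch meant to be saved as a script. The photos folder, its contents, and the photos_backup name are made up for the demo, and the 64 KB part size is used only because it’s the smallest split size zip accepts:

```shell
# Skip the demo when zip is not installed
command -v zip >/dev/null 2>&1 || { echo "zip not found, skipping"; exit 0; }

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up folder with roughly 200 KB of random, incompressible data
mkdir photos
head -c 100000 /dev/urandom > photos/a.bin
head -c 100000 /dev/urandom > photos/b.bin

# -s 64k caps each part at 64 KB
zip -q -r -s 64k photos_backup.zip photos

# List the parts: numbered .zNN files plus the final .zip
ls -1 photos_backup.z*
```

Random data is used on purpose: it doesn’t compress, so the archive stays big enough to be divided into several parts.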
4. unzip Multi-File Archive
To extract the zipped multi-file archive, we need to ‘fix’ it first, that is, merge it into a single-file archive. So, let’s run zip with the -FF option on the ‘head’ file of the archive, Pictures_backup.zip. In addition, we need the --out option to name the resulting file:
$ zip -FF Pictures_backup.zip --out Pictures_single.zip
Next, we can unzip the resulting file Pictures_single.zip into a temporary directory indicated by the -d option:
$ unzip Pictures_single.zip -d ./temp
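The fix-then-extract sequence can be rehearsed end to end on throwaway data. The sketch below is meant to be saved as a script and assumes that zip 3.0 and unzip are installed. The photos folder and all archive names are hypothetical:

```shell
# Skip the demo when the tools are not installed
command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1 \
    || { echo "zip or unzip not found, skipping"; exit 0; }

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up folder with random data, archived in 64 KB parts
mkdir photos
head -c 150000 /dev/urandom > photos/a.bin
zip -q -r -s 64k photos_backup.zip photos

# Merge the parts into one archive, then extract it into ./temp
zip -q -FF photos_backup.zip --out photos_single.zip
unzip -q photos_single.zip -d temp

# The extracted copy should be byte-identical to the source file
if cmp -s photos/a.bin temp/photos/a.bin; then
    status="identical"
else
    status="different"
fi
echo "extracted file is $status"
```

The final cmp check plays the same role as the md5sum comparison from the previous section.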
Finally, let’s notice that if we’ve split an ordinary single-file zip archive with split, we don’t need to ‘fix’ it. All we need to do is join the parts with cat and then unzip the result.
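As a sketch of this second route, we can create an ordinary archive, cut it with split, and restore it with cat alone. It’s meant to be saved as a script, and the docs folder, the part prefix, and the 20000-byte piece size are assumptions made for the demo:

```shell
# Skip the demo when the tools are not installed
command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1 \
    || { echo "zip or unzip not found, skipping"; exit 0; }

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Made-up content archived into an ordinary single-file zip
mkdir docs
head -c 50000 /dev/urandom > docs/data.bin
zip -q -r docs.zip docs

# Cut the finished archive into 20000-byte pieces, then drop the original
split -d -b 20000 docs.zip docs.zip.part_
rm docs.zip

# Rejoin the pieces and extract without any fixing step
cat docs.zip.part_* > docs_joined.zip
unzip -q docs_joined.zip -d restored

if cmp -s docs/data.bin restored/docs/data.bin; then
    status="identical"
else
    status="different"
fi
echo "restored file is $status"
```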
5. Dealing With Missing Parts
When some parts of the multipart zip archive are missing, we can still repair it with the -FF option of zip. Let’s notice that we can pass the ‘head’ Pictures_backup.zip file as the argument even if it’s the missing one. While the command is running, we’ll come across a menu:
Could not find:
  Pictures_backup.z06
Hit c      (change path to where this split file is)
    s      (skip this split)
    q      (abort archive - quit)
    e      (end this archive - no more splits)
    z      (look for .zip split - the last split)
 or ENTER  (try reading this split again):
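We should react accordingly. Usually, when the ‘head’ file is missing, zip can’t tell where the archive ends and asks for a file that never existed. In this example, it’s Pictures_backup.z06. Thus, we need to answer with e to tell zip that there are no more parts. On the other hand, when an intermediate part is missing, we should answer with s to skip it.
Finally, we can unzip the resulting single-file archive, which contains only the entries that zip managed to salvage.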
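Before starting the repair, it helps to know which parts are actually absent so that the answers to the menu are ready. The list_missing_parts function below is a hypothetical helper, not a zip feature. It assumes that we know how many files the archive originally consisted of and that the parts use the two-digit numbering shown earlier:

```shell
# Hypothetical helper: print every part of a split archive that is absent.
# $1 = archive name without extension
# $2 = total number of files, counting the final .zip
list_missing_parts() {
    base=$1
    total=$2
    i=1
    while [ "$i" -lt "$total" ]; do
        part=$(printf '%s.z%02d' "$base" "$i")
        [ -e "$part" ] || echo "$part"
        i=$((i + 1))
    done
    [ -e "$base.zip" ] || echo "$base.zip"
}

# Demo with empty stand-in files: six files expected, .z03 and .zip absent
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch Pictures_backup.z01 Pictures_backup.z02 \
      Pictures_backup.z04 Pictures_backup.z05
missing=$(list_missing_parts Pictures_backup 6)
echo "$missing"
```

In this demo, the helper reports Pictures_backup.z03 and Pictures_backup.zip, which corresponds to skipping one part and ending the archive by hand in the menu.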
In this article, we learned how to split a file into parts and join the pieces back together. Next, we described how to use the popular compression utilities zip and unzip to handle split archives.