1. Introduction

When it comes to storage, a block is the minimum unit we can allocate or address. However, what this actually means depends on the level we look at: the storage medium and its controller, the filesystem, and the operating system kernel.

Since the synchronization between these isn’t always straightforward, we may have to check each level individually.

In this tutorial, we look at storage block sizing and ways to inspect and influence it. First, we go over block sizes on different levels. After that, we explore a common command for working with the block size at each level.

We tested the code in this tutorial on Debian 12 (Bookworm) with GNU Bash 5.2.15. It should work in most POSIX-compliant environments unless otherwise specified.

2. Block Size

Blocks are the minimal units that we can allocate and address. Storage blocks are a convenient but fairly misleading concept, since the term isn’t strictly defined.

2.1. Storage Units

To begin with, let’s define some units:

+----------+------+----------+
| Name     | Unit | Value    |
+----------+------+----------+
| Bit      | b    | 1 / 0    |
| Byte     | B    | 8 b      |
+----------+------+----------+
| Kilobyte | KB   | 1000 B   |
| Kibibyte | KiB  | 1024 B   |
+----------+------+----------+
| Megabyte | MB   | 1000 KB  |
| Mebibyte | MiB  | 1024 KiB |
+----------+------+----------+
| Gigabyte | GB   | 1000 MB  |
| Gibibyte | GiB  | 1024 MiB |
+----------+------+----------+
| Terabyte | TB   | 1000 GB  |
| Tebibyte | TiB  | 1024 GiB |
+----------+------+----------+
| Petabyte | PB   | 1000 TB  |
| Pebibyte | PiB  | 1024 TiB |
+----------+------+----------+
| Exabyte  | EB   | 1000 PB  |
| Exbibyte | EiB  | 1024 PiB |
+----------+------+----------+

Although similar, one set of units follows the SI powers of 10, while the other uses powers of 2 for technical reasons. Still, the latter may be harder for humans to work with. Also, while the difference between KB and KiB is an often ignorable 2.4%, EB and EiB differ by more than 15% per unit.
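To make these percentages concrete, here is a minimal sketch that computes how much larger a binary unit is than its decimal counterpart. The pct_diff helper name is our own, and we assume an awk implementation that supports the ^ operator, as POSIX awk does:

```shell
# pct_diff N: by what percentage 1024^N exceeds 1000^N
# (N=1 compares KiB with KB, N=6 compares EiB with EB)
pct_diff() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.2f\n", (1024^n / 1000^n - 1) * 100 }'
}

pct_diff 1    # KiB vs. KB: 2.40
pct_diff 6    # EiB vs. EB: 15.29
```

The gap grows with every step up the table, since each step multiplies the ratio by another 1.024.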

If omitted, the unit is usually [B]ytes.

2.2. Storage Medium Block

Physical mediums have a built-in minimal unit enforced by their controller.

For instance, a modern solid-state drive (SSD) may report a logical block size of 512B. This comes from older technology, since hard disk drives (HDDs) traditionally used 512B sectors. Yet, the actual minimal unit of the medium is now commonly 4KiB:

$ fdisk --list /dev/sda
Disk /dev/sda: 32 GiB, 34359738368 bytes, 67108864 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
[...]

In this case, fdisk [--list]s information about /dev/sda, where we can see the logical sector size in relation to the physical block size. Importantly, special configurations such as RAID may impose different settings.

So, the discrepancy between the reported logical block size and the actual physical block size means that the system can request 512B chunks, but the physical medium reads a whole 4KiB block to extract the 512B sector. Although this may be wasteful in many cases, a request for the next 512B chunk of the same physical block usually doesn’t trigger further storage operations due to buffering.
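As a worked example of this relationship, the following sketch maps a logical sector number to the physical block that contains it, assuming the 512B and 4096B sizes from the fdisk output above. The phys_block_of helper is a hypothetical name used only for illustration:

```shell
logical=512      # logical sector size in bytes, from the fdisk output
physical=4096    # physical block size in bytes, from the fdisk output

# phys_block_of SECTOR: index of the physical block holding a logical sector
phys_block_of() {
    echo $(( $1 * logical / physical ))
}

echo "$(( physical / logical )) logical sectors per physical block"
phys_block_of 0    # first sector of physical block 0
phys_block_of 7    # last sector of physical block 0
phys_block_of 8    # first sector of physical block 1
```

Sectors 0 through 7 all map to physical block 0, which is why reading neighboring sectors doesn't necessarily require another access to the medium.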

2.3. Filesystem Block

When it comes to filesystems, the block size in terms of minimal allocation unit can also be called a cluster.

This is specified when formatting:

$ mke2fs -t ext4 -b 4096 /dev/sdx1

Here, we use mke2fs to format /dev/sdx1 with the ext4 filesystem and a [-b]lock size of 4096B. If we don’t explicitly add a [-b]lock or [-C]luster size, mke2fs consults its ruleset in /etc/mke2fs.conf:

$ cat /etc/mke2fs.conf
[defaults]
      base_features = sparse_super,large_file,filetype,resize_inode,dir_index,ext_attr
      default_mntopts = acl,user_xattr
      enable_periodic_fsck = 0
      blocksize = 4096
      inode_size = 256
      inode_ratio = 16384

[fs_types]
      ext3 = {
              features = has_journal
      }
      ext4 = {
              features = has_journal,extent,huge_file,flex_bg,metadata_csum,64bit,dir_nlink,extra_isize
      }
      small = {
              blocksize = 1024
              inode_ratio = 4096
      }
      floppy = {
              blocksize = 1024
              inode_ratio = 8192
      }
      big = {
              inode_ratio = 32768
      }
[...]

Here, we can see some defaults, as well as specific features per filesystem type. Further down, we can also see sizing information. The final block size depends on the filesystem type, expected usage as specified by the mke2fs -T option, as well as the size of the formatted partition. Importantly, the block size must be a power of 2.
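Since the block size must be a power of 2, we can sanity-check a candidate value before formatting. Below is a minimal sketch with a hypothetical is_power_of_two helper. It only covers the power-of-2 rule, not the range of values that mke2fs additionally enforces:

```shell
# is_power_of_two N: succeed if N is a positive power of 2
# (a power of 2 has a single bit set, so N & (N - 1) is 0)
is_power_of_two() {
    [ "$1" -gt 0 ] && [ $(( $1 & ($1 - 1) )) -eq 0 ]
}

for size in 1024 3000 4096; do
    if is_power_of_two "$size"; then
        echo "$size: power of 2"
    else
        echo "$size: not a power of 2"
    fi
done
```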

Let’s check the block size of our current root partition:

$ tune2fs -l /dev/sda1
[...]
Block size:               4096
[...]

As expected, the filesystem block size reported by tune2fs matches the physical block size of the storage medium. This isn’t a requirement, but is usually the case for better performance on many systems.
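When we need only this one value in a script, we can filter the tune2fs output. The following is a sketch that assumes the "Block size:" label shown in the listing above. The fs_block_size function name is our own:

```shell
# fs_block_size: read `tune2fs -l` output on stdin, print the block size
fs_block_size() {
    awk -F: '/^Block size:/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Typical use (needs read access to the device, usually root):
#   tune2fs -l /dev/sda1 | fs_block_size

# Demonstration with the line from the listing above
printf 'Block size:               4096\n' | fs_block_size
```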

2.4. Kernel Block

Regardless of the physical and filesystem specifications and limitations, the operating system (OS) kernel can request any amount of information at any time.

Still, the OS usually has different mechanisms to optimize its storage reads. Most notably, information transfers between the RAM cache and secondary storage commonly happen in pages. On the other hand, the read() system call can request any size but the minimum will always depend on the hardware, controller, and filesystem.
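To relate pages to the filesystem block size, we can query the page size with getconf. This sketch assumes the 4096B filesystem block size we saw earlier. The actual page size depends on the platform:

```shell
page_size=$(getconf PAGESIZE)    # memory page size in bytes
fs_block=4096                    # filesystem block size, as reported by tune2fs earlier

echo "Page size: ${page_size}B"
echo "Filesystem blocks per page: $(( page_size / fs_block ))"
```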

3. Using blockdev

As we already saw, storage block sizes mainly exist on two levels:

  • physical storage medium: physical and logical
  • virtual partition filesystem: block and cluster

Thus, having a way to check and change these settings may be beneficial. In particular, we can use the blockdev command.

It’s good practice to use --rereadpt to reread the partition table before checking data for a given partition.

3.1. General Device Information

To begin with, blockdev can tell us about the size and structure of the hardware device we have:

  • --getsize64: size in bytes
  • --getsz: size in 512B sectors
  • --getdiscardzeroes: discard zeroes support status, i.e., whether discarded blocks read back as zeroes
  • --getalignoff: alignment offset in bytes

Let’s try it out:

$ blockdev --getsize64 /dev/sda
34359738368

This value matches the total size in our earlier fdisk listing of /dev/sda.
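As a quick cross-check, the byte count should equal the sector count multiplied by 512, since --getsz counts in 512B sectors. Here is a short sketch that uses the values from the listings above as fixed sample data:

```shell
size_bytes=34359738368    # blockdev --getsize64 /dev/sda, as above
sectors=67108864          # sector count from the fdisk listing

# Compare both views of the device size and print it in GiB
if [ $(( sectors * 512 )) -eq "$size_bytes" ]; then
    echo "sizes agree: $(( size_bytes / 1024 / 1024 / 1024 )) GiB"
else
    echo "sizes differ"
fi
```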

3.2. Get Block Information

Of course, blockdev also offers several flags to specifically check the current block size information:

  • --getss: logical block size
  • --getpbsz: physical block size
  • --getiomin: smallest chunk per storage operation
  • --getioopt: optimal chunk per storage operation
  • --getmaxsect: max sectors per request
  • --getbsz: block size

Let’s confirm the logical and physical block sizes, this time through the /dev/sda1 partition:

$ blockdev --getss /dev/sda1
512
$ blockdev --getpbsz /dev/sda1
4096

As expected, we get the same block sizes as in the fdisk output. Since the physical medium and its controller handle 4096B blocks, the minimum we can exchange between it and RAM is that number of bytes, not 512B:

$ blockdev --getiomin /dev/sda
4096

In this case, the current --getbsz block size is also 4096B.
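To collect these values in one pass, we can loop over the flags. The sketch below uses only the blockdev options listed above. The block_summary name and the /dev/sda path are examples, and querying a device usually requires root:

```shell
# block_summary DEVICE: print the main block-related values for a device
block_summary() {
    for flag in --getss --getpbsz --getiomin --getioopt --getbsz; do
        printf '%s %s\n' "$flag" "$(blockdev "$flag" "$1")"
    done
}

# Only query the example device if it exists and is a block device
if [ -b /dev/sda ]; then
    block_summary /dev/sda
fi
```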

3.3. Readahead Sectors

On average, many of the read and write operations on a given medium are sequential. In other words, if we read or write position X, we expect that position X+1 will be the next one read or written.

This situation is so prevalent that there is a so-called readahead feature that automatically fetches a number of following storage blocks or sectors before they are requested. So, several options enable us to set and get this number on different levels:

  • --setra: set number of readahead sectors
  • --getra: get number of readahead sectors
  • --setfra: set number of filesystem readahead sectors
  • --getfra: get number of filesystem readahead sectors

While the first two options work with the storage medium, the latter two relate to the filesystem.
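Because blockdev expresses readahead in 512B sectors, it can help to convert the number into a more familiar unit. Here is a minimal sketch with a hypothetical ra_to_kib helper. The value 256 is only sample input, not a measured result:

```shell
# ra_to_kib SECTORS: convert a readahead value in 512B sectors to KiB
ra_to_kib() {
    echo $(( $1 * 512 / 1024 ))
}

# Typical use (device path is an example; usually needs root):
#   ra_to_kib "$(blockdev --getra /dev/sda)"

ra_to_kib 256    # 256 sectors correspond to 128 KiB
```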

3.4. Buffering

Buffering is part of the readahead and caching strategy. In short, it’s a way to copy data that we expect to need shortly from secondary storage to main memory. However, we may sometimes want to drop the buffered data, so we can use --flushbufs to flush the buffers:

$ blockdev --flushbufs /dev/sda1

Naturally, buffering exists on many levels, so a complete buffer flush might require more commands.

Importantly, if we clear the buffers, we can ensure the next request for a read operation would actually cause a block to be read instead of pulling data from the cache.
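As an illustration of what a more complete flush might involve, the following sketch chains sync, blockdev --flushbufs, and the Linux-specific vm.drop_caches setting. The run and flush_device helpers and the DRY_RUN variable are hypothetical names. We only print the commands here, since actually running them requires root and affects the whole system:

```shell
# run CMD...: execute a command, or only print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# flush_device DEVICE: write out dirty data, then drop buffers and caches
flush_device() {
    run sync                           # write pending changes to storage
    run blockdev --flushbufs "$1"      # flush the buffers of this device
    run sysctl -w vm.drop_caches=3     # drop page cache, dentries, and inodes
}

# Preview the commands without running them
DRY_RUN=1
flush_device /dev/sda1
```

Whether all three steps are necessary depends on the use case, so this is a starting point rather than a fixed recipe.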

4. Summary

In this article, we delved into block sizes and how to manipulate them.

In conclusion, although the storage block size can be a bit misleading, having a grasp of each related system can help troubleshoot and optimize it.