How to Calculate Optimal Blocksize to Use With dd

1. Overview

In this tutorial, we’ll see how to find a good block size for a storage device. This is useful when we use dd or any other program that reads from or writes to a storage device, because choosing the optimal block size gives us faster transfers.

To accomplish this, we can run dd with different block sizes and then use the fastest one. We can also use stat or tune2fs to get a hint of the best block size.

2. Trying Different Values and Using the Best One

Our first option is to try different values and see which one is the best. This is the slowest approach, but it’s the only one that measures actual performance instead of relying on a hint.

We can do this using dd to copy from /dev/zero to a temporary file. To compare speeds, we limit the execution time of each run using the timeout command. The block size that writes the most bytes within that time is the fastest.

Let’s first create our calculate_best_bs.sh script:

#!/bin/bash

IF=/dev/zero
OF=/tmp/calculate_best_bs.temp
TEST_BLOCK_SIZES=(256 512 1024 2048 4096 8192 16384 32768 65536 131072 262144)
TIMEOUT=20s
BEST_BLOCK_SIZE=0
BEST_BLOCK_SIZE_WROTE=0

for BS in "${TEST_BLOCK_SIZES[@]}"; do
    rm -f "$OF"
    # Flush dirty pages, then drop the page cache so every run starts cold (needs root)
    sync
    echo 3 >/proc/sys/vm/drop_caches
    # Let dd write for a fixed time; timeout stops it when the time is up
    timeout "$TIMEOUT" dd "if=$IF" "of=$OF" "bs=$BS" >/dev/null 2>&1
    # Bytes written in this run; fall back to 0 if the file was never created
    SIZE=$(stat -c "%s" "$OF" 2>/dev/null || echo 0)
    if [ "$SIZE" -gt "$BEST_BLOCK_SIZE_WROTE" ]; then
        BEST_BLOCK_SIZE_WROTE="$SIZE"
        BEST_BLOCK_SIZE="$BS"
    fi
done
rm -f "$OF"
echo "Fastest Block Size: $BEST_BLOCK_SIZE"

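Note we need to run this script as root, so it can drop the disk cache before each run. Also, we may want to modify the output file $OF so the script writes to the filesystem we want to test. For example, /tmp is a RAM-backed tmpfs on some systems, which wouldn’t measure the disk at all.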

The script has an array of block sizes we want to test, and also a timeout of 20 seconds. We can limit the time it takes the script to run by adding or removing block size values, and by changing the timeout.
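A variation on the same idea is to write a fixed amount of data for each block size and measure the elapsed time, instead of using timeout. The sketch below isn’t part of the script above: the function name time_dd_write and its arguments are hypothetical, and it assumes GNU dd (for conv=fdatasync) and GNU date (for %N). Because conv=fdatasync makes dd flush the file data before it exits, the measured time includes the actual write to storage:

```shell
#!/bin/bash

# Hypothetical helper: write TOTAL_BYTES to OUTFILE using block size BS
# and print the elapsed time in milliseconds.
time_dd_write() {
    local bs="$1" total="$2" out="$3"
    local count=$(( total / bs ))   # number of blocks to write
    local start end

    start=$(date +%s%N)             # nanoseconds since the epoch (GNU date)
    # conv=fdatasync: flush the file data to storage before dd exits
    dd if=/dev/zero of="$out" bs="$bs" count="$count" conv=fdatasync 2>/dev/null
    end=$(date +%s%N)

    rm -f "$out"
    echo $(( (end - start) / 1000000 ))
}
```

For example, time_dd_write 65536 268435456 /mnt/data/bs_test.tmp would write 256 MiB in 64 KiB blocks (the path is only a placeholder). The block size with the smallest result is the fastest for that filesystem.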

Let’s run it and see what we get:

$ ./calculate_best_bs.sh
Fastest Block Size: 32768
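
To compare runs in more familiar units, we can convert the number of bytes written into a throughput figure by dividing it by the duration of the run. The helper below is a hypothetical addition, since the script above only compares raw byte counts. It uses integer arithmetic, so the result is rounded down, and it expects the duration as a plain number of seconds (20, not 20s):

```shell
# Hypothetical helper: convert BYTES written in SECONDS into whole MiB/s.
throughput_mib_s() {
    local bytes="$1" seconds="$2"
    echo $(( bytes / seconds / 1048576 ))
}

# 2097152000 bytes written in 20 seconds
throughput_mib_s 2097152000 20   # prints 100
```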

3. Using the stat Command

Alternatively, we can use the stat command on any file or directory and it will print the filesystem’s block size. Note this is just a hint, and it may not be the actual fastest one:

$ stat /tmp/inputfile.bin
  File: /tmp/inputfile.bin
  Size: 16350           Blocks: 32         IO Block: 4096   regular file
Device: fd03h/64771d    Inode: 20971538    Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/    nico)   Gid: (  100/   users)
Access: 2021-02-13 19:24:01.682329376 -0300
Modify: 2021-02-13 19:24:01.682329376 -0300
Change: 2021-02-13 19:24:15.581233839 -0300
 Birth: 2021-02-13 19:24:15.581233839 -0300

As we can see, it printed the value IO Block: 4096. That is the preferred I/O block size reported by the filesystem where the file (or directory) is located.

stat accepts a parameter -c (or --format) that allows us to specify what information we want. If we use the parameter -c %o, stat will print only the block size.

Let’s build on the previous command and store the block size in a variable called BLOCKSIZE:

$ BLOCKSIZE=$(stat -c "%o" /tmp/inputfile.bin)
$ echo $BLOCKSIZE
4096
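
GNU stat can also query the filesystem itself with the -f option, where the %S format prints the fundamental block size used for block counts. As a small sketch, the hypothetical print_block_hints function below shows both values side by side for a given path. They may differ on some filesystems, so it can be worth checking both:

```shell
# Hypothetical helper: print the I/O size hint for PATH and the
# fundamental block size of the filesystem that contains it.
print_block_hints() {
    local path="$1"
    echo "io_block=$(stat -c '%o' "$path")"      # same value as "IO Block" above
    echo "fs_block=$(stat -f -c '%S' "$path")"   # filesystem's fundamental block size
}

print_block_hints /tmp
```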

Finally, we can use what we’ve just learned to write a small script to use dd with the block size the previous command found. Let’s make a function that takes the input and output files as arguments and prints the parameters needed for dd:

$ dd_with_bs() {
    local OUT_DIR IN_BS OUT_BS
    # The output file may not exist yet, so we inspect its parent directory
    OUT_DIR=$(dirname "$2")
    if [ ! -e "$1" ] || [ ! -e "$OUT_DIR" ]; then
        echo "$1 or $OUT_DIR doesn't exist" >&2
        return 1
    fi
    IN_BS=$(stat -c "%o" "$1")
    OUT_BS=$(stat -c "%o" "$OUT_DIR")
    echo "dd \"if=$1\" \"of=$2\" \"ibs=$IN_BS\" \"obs=$OUT_BS\""
}

Notice we used dirname in case the output file doesn’t exist yet. Let’s now run our new function:

$ dd_with_bs /tmp/inputfile.bin /tmp/outputfile.bin
dd "if=/tmp/inputfile.bin" "of=/tmp/outputfile.bin" "ibs=4096" "obs=4096"

We may want to complete the dd command with more parameters using the previous output as a template.
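
If we’d rather run the copy directly than print a template, a variant along these lines could work. This is only a sketch: the name copy_with_bs is hypothetical, and status=none assumes a reasonably recent GNU dd:

```shell
# Hypothetical variant of dd_with_bs that runs dd instead of printing the command.
copy_with_bs() {
    local src="$1" dst="$2"
    local dst_dir in_bs out_bs

    dst_dir=$(dirname "$dst")
    if [ ! -e "$src" ] || [ ! -d "$dst_dir" ]; then
        echo "$src or $dst_dir doesn't exist" >&2
        return 1
    fi

    in_bs=$(stat -c "%o" "$src")        # read size hint from the input file
    out_bs=$(stat -c "%o" "$dst_dir")   # write size hint from the output directory

    # status=none suppresses dd's transfer statistics
    dd if="$src" of="$dst" ibs="$in_bs" obs="$out_bs" status=none
}
```

We’d call it the same way as before, for instance copy_with_bs /tmp/inputfile.bin /tmp/outputfile.bin.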

4. Using tune2fs

Finally, we can also get the partition’s block size by running tune2fs -l <partition>. In this case, we also need to have root privileges, and the filesystem has to be ext2, ext3, or ext4.

Let’s try running it against /dev/sda1 and using grep to filter the information:

$ tune2fs -l /dev/sda1 | grep "Block size:"
Block size:               4096

Also, we can use awk to obtain just the value. This can be useful when we want to run a one-liner or store the value in a variable. Let’s store the value in a variable called BLOCKSIZE:

$ BLOCKSIZE=$(tune2fs -l /dev/sda1 | awk '/Block size:/{print $3}')
$ echo $BLOCKSIZE
4096

There is an alternative to tune2fs called dumpe2fs. It is part of the same package as tune2fs, and it also needs to run with root privileges. In this case, we can use dumpe2fs -h <partition> to get the block size:

$ dumpe2fs -h /dev/sda1 | grep "Block size:"
dumpe2fs 1.46.0 (29-Jan-2020)
Block size:               4096
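
Since both tools print the same Block size: line, the parsing can be wrapped in a small helper that reads from standard input. The function name extract_block_size is hypothetical. Note also that the dumpe2fs version banner above got past grep, which suggests it’s written to standard error, so we can likely discard it with 2>/dev/null:

```shell
# Hypothetical helper: read `tune2fs -l` or `dumpe2fs -h` output on stdin
# and print only the block size value.
extract_block_size() {
    awk -F': *' '/^Block size:/ { print $2 }'
}

# Sample input in the same layout as the output shown above
printf 'Block count:              262144\nBlock size:               4096\n' | extract_block_size
```

With root privileges, we could then run dumpe2fs -h /dev/sda1 2>/dev/null | extract_block_size to get just the number.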

5. Conclusion

In this article, we saw different ways of determining the fastest block size to use with dd.

First, we wrote a script that tests different values and calculates the fastest. We also used stat and tune2fs as alternative methods to obtain a hint of which block size would be the best.
