1. Introduction

Whether we’re dealing with images, audio files, or video files, we must understand the differences between the various formats. Using the wrong audio or video format may reduce the quality of the files or make the files unnecessarily large.

The most important thing that we need to understand is the difference between codecs and containers.

A codec takes raw video and audio and compresses it so that it occupies less space. The output is then packaged into a specific format called a container. AVI and MP4 are examples of containers, while H.264 and MP3 are examples of codecs; each codec has its own properties, strengths, and weaknesses.

Keep in mind that to get a high-quality output video during conversion, the input video should also be high-quality.

In this tutorial, we’ll explore using FFmpeg to preserve the best audio and video quality possible while converting between different formats.

2. Installation

FFmpeg is one of the most popular video processing and compression libraries. It powers some of the most efficient commercial encoders as well as free, open-source multimedia players such as VLC.

Before we proceed, we need to ensure we have FFmpeg available on the system:

$ ffmpeg -version
ffmpeg version 4.2.7-0ubuntu0.1 Copyright (c) 2000-2022 the FFmpeg developers
built with gcc 9 (Ubuntu 9.4.0-1ubuntu1~20.04.1)

In case it’s not available, we can install it using the system’s package manager. For Debian-based distros such as Ubuntu, we can use apt:

$ sudo apt install ffmpeg

For Arch-based Linux distros, we can use:

$ sudo pacman -S ffmpeg

3. Choosing the Right Video and Audio Codec

Selecting the right video codec is critical for getting high-quality video output.

FFmpeg supports a variety of video codecs, including MPEG-1, MPEG-2, VP8, and H.264, which is currently the most popular video codec thanks to its efficient, high-quality compression. In FFmpeg, H.264 encoding is provided by the libx264 encoder.
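
Before relying on a particular codec, we can check which encoders our FFmpeg build actually ships with. As a quick sanity check, let’s list the available encoders and filter for H.264:

```shell
# List the encoders this FFmpeg build supports and filter for H.264
ffmpeg -hide_banner -encoders | grep -i 264
```

If libx264 doesn’t appear in the list, our build was compiled without it, and the examples below would need a build with libx264 enabled.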

Let’s use the ffmpeg command to convert a sample.mp4 video to AVI format using the H.264 codec:

$ ffmpeg -i sample.mp4 -c:v libx264 output.avi

We’re using the -c:v option to specify the video codec, and output.avi represents the output video file.

FFmpeg also supports several audio codecs, including MP3, AAC, and PCM. For the best audio quality, we need to use a lossless codec such as PCM. However, lossless codecs result in much larger file sizes.

Let’s convert the sample.mp4 video to AVI format and convert the audio to PCM format:

$ ffmpeg -i sample.mp4 -c:v copy -c:a pcm_s16le output.avi

The -c:v copy option copies the video stream without re-encoding it. The -c:a pcm_s16le option converts the audio stream to uncompressed PCM audio with 16-bit depth and little-endian byte order. This ensures the best audio quality possible.

Putting it all together, we can convert the sample.mp4 video file to AVI while converting audio and video codecs:

$ ffmpeg -i sample.mp4 -c:v libx264 -c:a pcm_s16le output.avi
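
To confirm the conversion worked, we can inspect the output with ffprobe. The sketch below is self-contained: it builds a one-second clip from FFmpeg’s synthetic testsrc and sine sources (so no input file is needed), converts it with the codecs above, and prints the resulting codec names:

```shell
# Build a one-second synthetic clip and convert it with the codecs above
ffmpeg -y -v error \
  -f lavfi -i testsrc=duration=1:size=320x240:rate=25 \
  -f lavfi -i sine=frequency=440:duration=1 \
  -c:v libx264 -pix_fmt yuv420p -c:a pcm_s16le output.avi

# Print one codec name per stream
ffprobe -v error -show_entries stream=codec_name -of csv=p=0 output.avi
```

We should see h264 for the video stream and pcm_s16le for the audio stream.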

4. Setting the Video and Audio Bitrate

The video bitrate determines the amount of data used to encode a video file. Higher bitrates generally produce higher-quality video. However, this also results in a larger output file.

When converting from AVI to MP4 or vice versa, we must choose a bitrate that’s appropriate for the intended use.

For example, when converting a video for online streaming, a lower bitrate may help reduce load times.
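
A quick way to pick a video bitrate is to work backwards from a target file size: divide the total kilobits by the duration, then subtract the audio bitrate. The figures below are purely hypothetical (a 100 MiB target, a 600-second video, and 192 kbps audio):

```shell
# Rough video bitrate for a target file size (hypothetical numbers)
target_mib=100      # desired output size in MiB
duration_s=600      # video duration in seconds
audio_kbps=192      # audio bitrate in kbps

# 1 MiB = 8192 kbit, so total kbps = size * 8192 / duration
video_kbps=$(( target_mib * 8192 / duration_s - audio_kbps ))
echo "${video_kbps}k"
```

This prints 1173k, which we could pass to -b:v. The calculation ignores container overhead, so in practice we’d leave a small margin.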

Let’s set a bitrate of 1200 kilobits per second (kbps) while converting a sample.mp4 file to AVI format:

$ ffmpeg -i sample.mp4 -c:v libx264 -c:a pcm_s16le -b:v 1200k output.avi

The -b:v option specifies the video bitrate, which in this case is 1200 kbps.

We can also adjust the audio bitrate: a higher audio bitrate results in better audio quality at the expense of a bigger output file. Note, however, that -b:a only applies to lossy codecs; uncompressed PCM has a fixed bitrate determined by its sample rate, bit depth, and channel count.

Let’s modify the command to use MP3 audio at a specific bitrate:

$ ffmpeg -i sample.mp4 -c:v libx264 -b:v 1200k -c:a libmp3lame -b:a 192k output.avi

The -b:a option specifies the audio bitrate. The command above produces the output.avi file, which uses the H.264 video codec at 1200 kbps and MP3 audio at 192 kbps.

5. Using the Correct Resolution and Framerate

The resolution of a video determines the number of pixels in the image. Higher resolutions generally result in a higher-quality image, but the resulting output file will be bigger.

Choosing the correct resolution depends on our intended use of the output video file. Lower resolutions are more appropriate for mobile devices, while higher resolutions are meant for larger screens or projectors.

Let’s set the resolution of a sample.mp4 file:

$ ffmpeg -i sample.mp4 -c:v libx264 -c:a pcm_s16le -s 640x480 output.avi

We’re using the -s option to specify the resolution as 640×480.
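
The -s option scales to exact dimensions, which can distort the image if the target aspect ratio differs from the source. An alternative is the scale filter with a height of -2, which preserves the aspect ratio while keeping the dimensions even, as most H.264 encoders require. The sketch below uses a synthetic 1280x720 testsrc input so it runs without a sample file:

```shell
# Scale to a 640-pixel width while preserving the 16:9 aspect ratio
ffmpeg -y -v error -f lavfi -i testsrc=duration=1:size=1280x720:rate=25 \
  -c:v libx264 -pix_fmt yuv420p -vf scale=640:-2 scaled.avi

# Confirm the resulting dimensions
ffprobe -v error -select_streams v:0 -show_entries stream=width,height \
  -of csv=p=0 scaled.avi
```

The ffprobe call should report 640,360, confirming the aspect ratio was preserved.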

Framerate measures how many frames are displayed per second. Higher framerates produce a smoother video but with a bigger size.

Let’s change the framerate of a sample.mp4 file’s conversion to AVI:

$ ffmpeg -i sample.mp4 -c:v libx264 -c:a pcm_s16le -s 640x480 -r 30 output.avi

The -r option sets the frames per second value to 30 fps.

6. Using Two-Pass Encoding

Two-pass encoding is a technique that retains the best quality of a video, while minimizing file size, during conversion. It does this by analyzing the video content in two passes.

During the first pass, FFmpeg analyzes the video content and creates a log file with information such as the number of colors, frames, and motion types. In the second pass, it uses the data collected during the first pass to achieve the best encoding quality possible.

Let’s use two-pass encoding to convert a sample.mp4 file:

$ ffmpeg -i sample.mp4 -c:v libx264 -b:v 3000k -c:a pcm_s16le -b:a 192k -pass 1 -f avi /dev/null && \
ffmpeg -i sample.mp4 -c:v libx264 -b:v 3000k -c:a pcm_s16le -b:a 192k -pass 2 output.avi

This command performs a two-pass encoding using the H.264 codec, PCM audio codec, video bitrate set to 3000 kbps, and an audio bitrate of 192 kbps.

Two-pass encoding optimizes the file size at the expense of a longer encoding time.
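
With libx264, the first pass writes its statistics to ffmpeg2pass-0.log (plus a companion .mbtree file) in the current directory; these are the default names, which can be changed with the -passlogfile option. Once the second pass finishes, the files are no longer needed and can be removed:

```shell
# Remove the first-pass statistics files (default libx264 names)
rm -f ffmpeg2pass-0.log ffmpeg2pass-0.log.mbtree
```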

7. Quality Conversion Using qscale

When using ffmpeg with codecs that support constant quality-based encoding, we can use the qscale option to tune the media quality and file size. Let’s walk through a few scenarios to understand this parameter in detail.

First, let’s see how to use the qscale option with ffmpeg:

-qscale[:stream_specifier] n

We must note that providing a stream_specifier is optional. However, if we don’t provide one, ffmpeg assumes video (v) as its default value. Further, the scale factor (n) is an integer whose value typically ranges from 1 to 31.

Now, let’s use the mpeg1video codec to convert an input.mp4 media file into MPG files using qscale values of 5 and 30:

$ ffmpeg -i input.mp4 -c:v mpeg1video -qscale:v 5 output_1.mpg
$ ffmpeg -i input.mp4 -c:v mpeg1video -qscale:v 30 output_2.mpg

Next, let’s see the size of the output files using the ls command:

$ ls -lh
total 1712328
-rw-r--r--@ 1 tavasthi  staff   163M Jul 28 08:12 input.mp4
-rw-r--r--  1 tavasthi  staff   518M Jul 28 08:13 output_1.mpg
-rw-r--r--  1 tavasthi  staff   143M Jul 28 08:14 output_2.mpg

It’s important to note that with a higher qscale value, we get lower quality and smaller file sizes. As a result, we can see that the size of output_2.mpg is considerably smaller than that of the output_1.mpg file.
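
We can reproduce this effect without an input file by encoding the same synthetic testsrc clip at both qscale values and comparing the sizes; the exact byte counts will vary by build, but the qscale 5 file should always be the larger one:

```shell
# Encode the same synthetic clip at qscale 5 and 30, then compare sizes
ffmpeg -y -v error -f lavfi -i testsrc=duration=2:size=640x480:rate=25 \
  -c:v mpeg1video -qscale:v 5 q5.mpg
ffmpeg -y -v error -f lavfi -i testsrc=duration=2:size=640x480:rate=25 \
  -c:v mpeg1video -qscale:v 30 q30.mpg
ls -l q5.mpg q30.mpg
```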

Moving on, let’s extract the frames from the input.mp4 and save them as JPG files using the qscale values of 5 and 20:

$ ffmpeg -i input.mp4 -qscale:v 5 scenario1_output_%03d.jpg
$ ffmpeg -i input.mp4 -qscale:v 20 scenario2_output_%03d.jpg

We must note that the “_%03d” in the filename is a placeholder that gets replaced with sequential numbers.
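
The %03d pattern is printf-style: a zero-padded, three-digit sequence number. We can preview the names it produces with the shell’s own printf:

```shell
# Preview the zero-padded names that the %03d pattern expands to
printf 'scenario1_output_%03d.jpg\n' 1 2 10
```

This prints scenario1_output_001.jpg, scenario1_output_002.jpg, and scenario1_output_010.jpg, matching the files FFmpeg writes.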

Further, let’s see the total number of JPG images for the two scenarios:

$ ls -lh scenario1_output_* | wc -l
$ ls -lh scenario2_output_* | wc -l

We can see that the total number of images is the same. However, there’s a clear difference in the file sizes:

$ ls -lh scenario1_output_* | head -2
-rw-r--r--  1 tavasthi  staff    57K Jul 28 08:51 scenario1_output_001.jpg
-rw-r--r--  1 tavasthi  staff    58K Jul 28 08:51 scenario1_output_002.jpg

$ ls -lh scenario2_output_* | head -2
-rw-r--r--  1 tavasthi  staff    23K Jul 28 08:50 scenario2_output_001.jpg
-rw-r--r--  1 tavasthi  staff    23K Jul 28 08:50 scenario2_output_002.jpg

As expected, the size of image files generated when using the qscale value of 5 is considerably larger than when we used the qscale value of 20.

8. Conclusion

In this article, we’ve explored different ways of ensuring we get the best audio or video quality while converting video files. Higher-quality videos tend to have higher bitrate, framerate, and resolution values. However, these improvements come at the cost of much larger files.

We also used two-pass encoding, which helps retain the best quality possible of the output video by analyzing its contents in two passes.
