Media players can display images embedded in audio files, which are usually album covers, artist photos, or other graphics related to the songs. The images can be stored as metadata in several audio formats, including MP3, OGG, FLAC, WMA, M4A, and MP4. Other widely used formats, such as AAC and WAV, don’t support the inclusion of images.
In this tutorial, we’ll see how to embed a cover image in the most popular audio formats.
Some media players, such as Celluloid, can display a cover without having to embed it. In fact, we only need to save it as cover.jpg in the same directory as the audio files. However, we won’t deal with this case.
2. Cover Image Requirements
There are no universal standards or guidelines for what images embedded in audio files should look like. Over time, the amount of storage available on devices has increased exponentially, from the few megabytes of early audio players to the gigabytes of today’s smartphones. As a result, support for higher resolution images in media players has also evolved.
We’ll follow the guidelines below to avoid wasting space on our devices and to maintain compatibility with older players:
- Square image
- 300×300 pixel resolution
- No larger than 500kB
- JPEG format with moderate compression
We can include much larger and heavier images if our target is only the latest high resolution devices. However, let’s keep in mind that the cover image has to be embedded in every single audio file, so the extra space we’ll use is the size of the cover image multiplied by the number of audio files.
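To make that multiplication concrete, here is a minimal shell sketch. The function name cover_overhead_bytes and the collection size of 1000 files are illustrative assumptions, not part of any tool:

```shell
#!/usr/bin/env bash
# Hypothetical helper: total extra bytes used when the same cover
# is embedded in every audio file of a collection.
cover_overhead_bytes() {
    local cover_bytes=$1
    local file_count=$2
    echo $(( cover_bytes * file_count ))
}

# A 500kB cover (500000 bytes) embedded in 1000 files:
cover_overhead_bytes 500000 1000   # prints 500000000, about 500MB
```

With a cover of about 20kB, the same collection would grow by only about 20MB.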
3. Example Audio and Image Files
Let’s create the testCoverImages folder and download a sample MP3 audio file:
$ mkdir testCoverImages
$ cd testCoverImages/
$ wget -O song.mp3 'https://getsamplefiles.com/download/mp3/sample-1.mp3'
Then, let’s convert it to OGG, FLAC, WMA, M4A, and MP4 formats:
$ ffmpeg -i song.mp3 song.ogg
$ ffmpeg -i song.mp3 song.flac
$ ffmpeg -i song.mp3 song.wma
$ ffmpeg -i song.mp3 song.m4a
$ ffmpeg -i song.mp3 song.mp4
We can also check that audio formats and extensions match:
$ for f in *; do file "$f"; done
song.flac: FLAC audio bitstream data, 24 bit, stereo, 44.1 kHz, 4236528 samples
song.m4a: ISO Media, Apple iTunes ALAC/AAC-LC (.M4A) Audio
song.mp3: MPEG ADTS, layer III, v1, 320 kbps, 44.1 kHz, Stereo
song.mp4: ISO Media, MP4 Base Media v1 [ISO 14496-12:2003]
song.ogg: Ogg data, Vorbis audio, stereo, 44100 Hz, ~112000 bps
song.wma: Microsoft ASF
Finally, let’s download a sample image and resize it to make it lighter:
$ wget -O image_hires.jpg 'https://getsamplefiles.com/download/jpg/sample-1.jpg'
$ convert image_hires.jpg -resize 300x300 image.jpg
$ wc -c image.jpg
21344 image.jpg
The last command shows that image.jpg is about 20kB, so it’s small enough to have no real effect on the size of the audio files.
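As a quick sanity check, the sketch below tests an image against the guidelines from section 2. The function name meets_cover_guidelines is a hypothetical helper, and the thresholds are simply those guidelines restated. Note that -resize 300x300 preserves the aspect ratio by default, so a non-square source image won’t become square:

```shell
#!/usr/bin/env bash
# Hypothetical helper: check width and height (pixels) and size (bytes)
# against the guidelines from section 2.
meets_cover_guidelines() {
    local width=$1
    local height=$2
    local bytes=$3
    if [ "$width" -ne "$height" ]; then
        echo "not square: ${width}x${height}"
        return 1
    fi
    if [ "$width" -ne 300 ]; then
        echo "not 300x300: ${width}x${height}"
        return 1
    fi
    if [ "$bytes" -gt 500000 ]; then
        echo "larger than 500kB: $bytes bytes"
        return 1
    fi
    echo "ok"
}

# Read the real values only if ImageMagick and image.jpg are available
if command -v identify >/dev/null 2>&1 && [ -f image.jpg ]; then
    w=$(identify -format '%w' image.jpg)
    h=$(identify -format '%h' image.jpg)
    meets_cover_guidelines "$w" "$h" "$(wc -c < image.jpg)" || true
fi
```

The check takes plain numbers as arguments, so it can also be used with values obtained from any other image tool.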
4. Command-Line Tools
We can use mediainfo to get technical information and other metadata about media files. For example, we can look at the audio details to see if a cover is embedded:
$ mediainfo song_with_cover.mp3
[...]
File size : 1.49 MiB
Duration : 1 min 36 s
[...]
Cover : Yes
Cover MIME : image/jpeg
[...]
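For scripting, we could test for the Cover line instead of reading the report by eye. The following is a minimal sketch that assumes the report format shown above. The function name report_has_cover is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a mediainfo report on standard input and
# succeed only if it contains a "Cover : Yes" line.
report_has_cover() {
    grep -Eq '^Cover[[:space:]]*:[[:space:]]*Yes'
}

# Example with report lines typed in by hand
printf 'Cover                : Yes\nCover MIME           : image/jpeg\n' \
    | report_has_cover && echo "cover found"

# With a real file:
#   mediainfo song_with_cover.mp3 | report_has_cover
```

Because the function only reads text from standard input, it can be tried without mediainfo installed.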
Some options of the tools below may vary slightly from one Linux distribution to another or may not be available at all. Our test distribution is Linux Mint 21.
lame is an MP3 encoder. It can embed a cover image into an MP3 file by re-encoding it:
$ lame --ti image.jpg song.mp3 song_with_cover.mp3
LAME 3.100 64bits (http://lame.sf.net)
[...]
Encoding song.mp3 to song_with_cover.mp3
[...]
Opening song_with_cover.mp3 in Celluloid shows the embedded cover. Our very lightweight test image, only 20kB in size, looks good even in full screen.
This solution isn’t fast because lame re-encodes files, and it doesn’t preserve the original audio quality. However, the difference in quality is unlikely to be noticeable because lame’s default parameters are well optimized. Also, it’s only usable with MP3 files.
In the next examples, we won’t see Celluloid because the visual result will be identical.
The following ffmpeg command adds a cover image to an MP3 file without re-encoding the audio stream, preserving audio quality and saving time:
$ ffmpeg -i song.mp3 -i image.jpg -map_metadata 0 -map 0 -map 1 -acodec copy song_with_cover.mp3
The previous options are also good for adding covers to M4A, MP4, OGG, and WMA files:
$ ffmpeg -i song.m4a -i image.jpg -map_metadata 0 -map 0 -map 1 -acodec copy song_with_cover.m4a
$ ffmpeg -i song.mp4 -i image.jpg -map_metadata 0 -map 0 -map 1 -acodec copy song_with_cover.mp4
$ ffmpeg -i song.ogg -i image.jpg -map_metadata 0 -map 0 -map 1 -acodec copy song_with_cover.ogg
$ ffmpeg -i song.wma -i image.jpg -map_metadata 0 -map 0 -map 1 -acodec copy song_with_cover.wma
To add a cover to a FLAC file, we need to add the -disposition:v attached_pic option:
$ ffmpeg -i song.flac -i image.jpg -map_metadata 0 -map 0 -map 1 -acodec copy -disposition:v attached_pic song_with_cover.flac
Since many options are required, let’s look at the meaning of each one:
- -i song.flac specifies the input audio file
- -i image.jpg specifies the cover image for the output file
- -map_metadata 0 copies the metadata from the first input file song.flac to the output file
- -map 0 includes all streams from the first input file song.flac in the output file; in this case, there is only one audio stream
- -map 1 includes all streams from the second input file image.jpg in the output file; in this case, the only stream is the cover image
- -acodec copy specifies that the audio streams should be copied without re-encoding
- -disposition:v attached_pic sets the disposition of the video stream, which in this case is the image, to attached_pic, indicating that the image should be treated as attached cover art
- song_with_cover.flac is the output file name
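Since the commands in this section differ only in the file names and in the extra FLAC option, they can be generated from one template. The sketch below only prints the command lines and doesn’t run ffmpeg. The function name cover_command is hypothetical, and it assumes simple file names without spaces:

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the ffmpeg command from this section for a
# given audio file, adding -disposition:v attached_pic only for FLAC.
# The printed line is not shell-quoted, so use simple file names.
cover_command() {
    local audio=$1
    local image=$2
    local ext="${audio##*.}"     # text after the last dot
    local base="${audio%.*}"     # text before the last dot
    local extra=""
    if [ "$ext" = "flac" ]; then
        extra=" -disposition:v attached_pic"
    fi
    echo "ffmpeg -i $audio -i $image -map_metadata 0 -map 0 -map 1 -acodec copy$extra ${base}_with_cover.$ext"
}

# Print the commands for three of our test files
for f in song.mp3 song.ogg song.flac; do
    cover_command "$f" image.jpg
done
```

Printing first and executing later makes it easy to review the commands before touching any files.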
In these examples, we used ffmpeg to add the cover image without re-encoding the audio, as this is usually the most desirable choice. However, nothing prevents us from re-encoding the audio and converting it from one format to another.
eyeD3 displays and manipulates metadata on MP3 files. Adding a cover image is quite simple:
$ eyeD3 --add-image="image.jpg":FRONT_COVER "song.mp3"
[...]
Adding image image.jpg
[...]
FRONT_COVER Image: [Size: 21344 bytes] [Type: image/jpeg]
[...]
eyeD3 doesn’t modify the audio stream in any way and has several options, including the ability to extract the embedded cover image.
mid3v2 allows us to edit the metadata of MP3 files. Adding a cover image is straightforward:
$ mid3v2 --picture image.jpg song.mp3
As we can see, the use of mid3v2 and eyeD3 to add a cover image is similar.
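Because these tools edit one file at a time, tagging a whole album takes a loop. Here is a minimal sketch using mid3v2. The function name embed_cover_in_mp3s and the RUN variable are illustrative assumptions. By default, the function only prints the commands it would run:

```shell
#!/usr/bin/env bash
# Hypothetical batch helper: embed one cover in every MP3 in a directory.
# By default it only prints the mid3v2 commands; set RUN=1 to execute them.
embed_cover_in_mp3s() {
    local dir=$1
    local image=$2
    local f
    for f in "$dir"/*.mp3; do
        [ -e "$f" ] || continue      # no MP3 files: the glob stayed literal
        if [ "${RUN:-0}" = "1" ]; then
            mid3v2 --picture "$image" "$f"
        else
            echo "mid3v2 --picture $image $f"
        fi
    done
}

# Dry run on the current directory
embed_cover_in_mp3s . image.jpg
```

Since both tools modify files in place, a dry run followed by a test on copies is a sensible precaution.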
As the name suggests, metaflac is for viewing and editing the metadata of FLAC files. Let’s add our cover image:
$ metaflac --import-picture-from="image.jpg" "song.flac"
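To verify the result, we can list the PICTURE metadata blocks of the file. The sketch below counts them. It assumes that metaflac --list reports each picture with a line such as "type: 6 (PICTURE)", and the function name count_picture_blocks is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: count PICTURE blocks in "metaflac --list" output
# read from standard input. Assumes each block is reported with a line
# such as "  type: 6 (PICTURE)".
count_picture_blocks() {
    grep -c 'type: 6 (PICTURE)'
}

# Run against the real file only if metaflac and song.flac are present
if command -v metaflac >/dev/null 2>&1 && [ -f song.flac ]; then
    metaflac --list --block-type=PICTURE song.flac | count_picture_blocks || true
fi
```

A count above 1 would suggest that the cover was imported more than once.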
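When it comes to audio quality, we should remember that FLAC is the only one of our test files that uses lossless compression. M4A and WMA files can also hold lossless audio (ALAC and WMA Lossless), but our ffmpeg conversions used the default lossy encoders.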
5. Conclusion
In this article, we’ve seen several tools for adding a cover image to an audio file from the Linux terminal: lame, ffmpeg, eyeD3, mid3v2, and metaflac.
ffmpeg is the most complete and flexible of them all. It’s the only one that supports all the audio formats, but it’s also the most difficult to use. All the other tools are simpler but specific to only one type of audio format.
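As a compact recap, the mapping between formats and tools can be written as a small lookup. This sketch covers only the tools and formats discussed in this article, and the function name tools_for is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the tools from this article that can embed
# a cover in a file, chosen by file extension.
tools_for() {
    case "${1##*.}" in
        mp3)              echo "ffmpeg lame eyeD3 mid3v2" ;;
        flac)             echo "ffmpeg metaflac" ;;
        m4a|mp4|ogg|wma)  echo "ffmpeg" ;;
        *)                echo "none"; return 1 ;;
    esac
}

tools_for song.flac   # prints: ffmpeg metaflac
tools_for song.ogg    # prints: ffmpeg
```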