What Is Steganography? | Baeldung on Computer Science

1. Introduction

In today’s digitally interconnected world, data security is becoming increasingly vital. Cryptography plays an ever bigger part in our daily online and offline habits, doing its job in ways that are not always visible. In this tutorial, we’ll discuss a topic closely related to cryptography: steganography. We’ll look at what it is, how it works, and where it’s being used.

2. What Is Steganography?

Steganography is the practice of hiding data within other pieces of data while trying to keep the presence of such data a secret. The data that is the carrier of the secret can be a video, an image, a text file, or in non-digital cases, a physical object. The same applies to secret data. Digitally, it is common to hide data inside images, as in “Digital Image Steganography”. Of course, steganography is a topic that applies not only to the digital world, although it is fair to say that it has seen big advancements on this particular front lately.

The difference between steganography and cryptography (if we can make such a loose comparison) lies in whether an outside observer is aware that a secret message exists.

In cryptography, the sole aim is to encrypt and hide the contents of the sent message. This means that we don’t care whether someone gets their hands on the encrypted ciphertext because we rely on our encrypting algorithm’s strength to keep that message from being decrypted without the correct key or other authentication.

However, in steganography, we want to keep the very existence of the secret message hidden. When we include a hidden message inside a digital image, for example, we want no one without prior knowledge to be able to tell that this specific image contains secret information.

3. Steganography Techniques

Because of its historical importance and influence during the previous century, we’ll briefly examine some techniques used for what we call Physical Steganography. The most basic form of physical steganography is invisible ink, with messages written between the lines of a private letter or on otherwise unrelated documents. Hiding pieces of text within other, larger texts is also a widespread method. An example is a message, letter, or book page where the first letter of each line spells out the hidden code.
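To make the first-letter technique concrete, here is a minimal Python sketch. The function name `decode_acrostic` and the cover text are made up for illustration and don't come from any historical example:

```python
def decode_acrostic(text: str) -> str:
    """Join the first character of every non-empty line of the text."""
    return "".join(
        line.strip()[0] for line in text.splitlines() if line.strip()
    )


# An innocent-looking note (hypothetical) whose line initials carry the message.
cover_text = """Have you seen the garden lately?
Every rose is in bloom.
Last week it rained a lot.
Please visit when you can."""

print(decode_acrostic(cover_text))  # HELP
```

A reader who doesn't know the convention sees only an ordinary note, which is exactly the property steganography aims for.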

Of course, spies took steganography to another level, especially during the Cold War. As a distinct example, we mention one case in which classified information was transmitted as the pitches of musical notes embedded in sheet music.

Arguably, Digital Steganography is much broader nowadays. It leaves room not only for a larger amount of information to be transmitted but also for more complex and sophisticated ways to hide data. One broadly used form is Digital Image Steganography, which we’ll discuss in more depth in the next section. Image Steganography is essentially the practice of hiding information inside image files. Depending on the technique, different amounts of data can be embedded in an image, with different levels of obscurity and resistance to detection mechanisms.

4. Least Significant Bit Steganography

To get a better understanding of how steganography works, let’s take a look at a typical technique, Least Significant Bit (LSB) Steganography. LSB is used to hide information inside digital files, such as images.

The idea is to replace the least significant bits of the pixel values in an image with bits of a secret message. Because the human eye is not sensitive to small changes in pixel values, the resulting image looks almost identical to the original.

Consider three pixels whose values are very similar. It is tough for the human eye to distinguish such slight differences, more so in a picture with hundreds or thousands of these micro differences.

The three numbers indicate the Red, Green, and Blue values for each pixel. The first pixel, for example, has a value of 200, 200, 30. In binary encoding, this would give:

11001000, 11001000, 00011110

The second pixel has a value of 201, 201, 31. This gives the following binary representation:

11001001, 11001001, 00011111
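The binary strings above can be reproduced with a few lines of Python. This is only an illustrative sketch, and the helper name `to_binary` is our own:

```python
def to_binary(pixel):
    """Format each 8-bit channel of an (R, G, B) tuple as a binary string."""
    return ", ".join(format(channel, "08b") for channel in pixel)


print(to_binary((200, 200, 30)))  # 11001000, 11001000, 00011110
print(to_binary((201, 201, 31)))  # 11001001, 11001001, 00011111
```

The two pixels differ only in the last bit of each channel.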

The last bit of each value can be used to carry a hidden message. With this technique, we use the last bit of each of the three color values, which gives us 3 hidden bits out of the 24 bits in a pixel. Even when we use the last two bits of each value, the pixel values remain very similar, so more aggressive schemes can be viable.
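A minimal sketch of this one-bit-per-value scheme, in plain Python and without any image library, might look as follows. The function names `embed_bits` and `extract_bits` are hypothetical, and a real tool would also need to read and write an actual lossless image format:

```python
def embed_bits(pixels, bits):
    """Return new pixels with each message bit written into the LSB of a channel."""
    channels = [c for pixel in pixels for c in pixel]
    if len(bits) > len(channels):
        raise ValueError("message does not fit in the carrier")
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the message bit.
        channels[i] = (channels[i] & ~1) | bit
    return [tuple(channels[i:i + 3]) for i in range(0, len(channels), 3)]


def extract_bits(pixels, count):
    """Read back the LSB of the first `count` channel values."""
    channels = [c for pixel in pixels for c in pixel]
    return [c & 1 for c in channels[:count]]


cover = [(200, 200, 30), (200, 200, 30)]
secret = [1, 1, 1, 0, 1, 0]

stego = embed_bits(cover, secret)
print(stego)                             # [(201, 201, 31), (200, 201, 30)]
print(extract_bits(stego, len(secret)))  # [1, 1, 1, 0, 1, 0]
```

Note that the receiver has to know how many bits to read. Practical schemes usually agree on a length prefix or an end marker, which this sketch leaves out.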

Such changes are difficult to spot by eye, and confirming them by direct comparison requires access to the original image. When a detecting party has no such access, it’s quite hard to identify the existence of hidden data through human inspection alone.

However, LSB steganography is not foolproof. Although the changes made to the pixel values are small, they alter the statistical properties of the image, so it’s possible for someone to uncover the presence of a hidden message using statistical analysis. Additionally, the hidden message can be destroyed if the image is modified, for example, by a resize or a crop.
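The fragility mentioned above is easy to demonstrate. The sketch below embeds bits into a flat list of values and then averages neighbouring pairs, which is a crude stand-in for resizing that we assume here for illustration only. Real resampling algorithms are more involved, and the helper names are our own:

```python
def embed(values, bits):
    """Write each bit into the LSB of the corresponding value."""
    marked = [(v & ~1) | b for v, b in zip(values, bits)]
    return marked + values[len(bits):]


def downscale(values):
    """Average each pair of neighbouring values (integer division)."""
    return [(values[i] + values[i + 1]) // 2 for i in range(0, len(values) - 1, 2)]


message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = embed([200] * 8, message)
print([v & 1 for v in stego])    # [1, 0, 1, 1, 0, 0, 1, 0]  (message intact)

resized = downscale(stego)
print(resized)                   # [200, 201, 200, 200]
print([v & 1 for v in resized])  # [0, 1, 0, 0]  (message lost)
```

After the simulated resize, only half of the values remain, and their least significant bits no longer match the message.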

5. Real World Applications of Digital Steganography

Given that the main purpose of steganography is to keep the existence of data a secret, it is no wonder that one of the major use cases comes from malware.

Infecting a system with malware is a two-step process: getting the code onto the system and keeping it hidden there as long as possible. Embedding malware within email attachments, images, documents, and other files is a common way to infect a system. Because this approach is so mainstream, antimalware software tries to keep up by scanning these types of files and checking for known viruses.

However, if a well-written malicious program uses steganographic techniques to hide within another file, many antimalware and antivirus tools become far less effective. It also doesn’t help that steganography detection programs are not yet of great quality, largely because of the complexity of such a demanding task.

Another mainstream application is Digital Watermarking. Digital watermarking is a technique that uses steganography to embed hidden information into a file. This information acts as a watermark and can contain a variety of metadata, such as the name of the copyright owner, the date and time the file was created, or another type of unique identifier. The point is to have specific information inside the file that can identify its original source.

A simple way to create a digital watermark is to modify the least significant bits of the digital data, similar to what we saw earlier. This allows the watermark to be embedded within the file without significantly altering its appearance or quality. The watermark is then typically extracted using specialized software that can detect and decode the hidden information.
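As a rough illustration of an LSB-based watermark, the following sketch hides a short text identifier in a sequence of raw bytes and reads it back. Everything here is an assumption made for the example: the function names, the sample owner string, and the use of `bytes(range(256))` as a stand-in for real media samples. Production watermarking schemes are generally designed to be far more robust than this:

```python
def embed_watermark(carrier: bytes, watermark: str) -> bytes:
    """Hide the UTF-8 bytes of `watermark` in the LSBs of `carrier`."""
    payload = watermark.encode("utf-8")
    # Most significant bit first for every payload byte.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(carrier):
        raise ValueError("carrier too small for watermark")
    out = bytearray(carrier)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit
    return bytes(out)


def extract_watermark(carrier: bytes, n_bytes: int) -> str:
    """Rebuild `n_bytes` of payload from the LSBs and decode them as UTF-8."""
    bits = [b & 1 for b in carrier[:n_bytes * 8]]
    payload = bytes(
        int("".join(map(str, bits[i:i + 8])), 2) for i in range(0, len(bits), 8)
    )
    return payload.decode("utf-8")


owner = "(c) Alice 2024"        # hypothetical identifier
media = bytes(range(256))       # stand-in for raw sample data
marked = embed_watermark(media, owner)

print(extract_watermark(marked, len(owner.encode("utf-8"))))
# No carrier byte changes by more than 1.
print(max(abs(a - b) for a, b in zip(media, marked)) <= 1)
```

The extractor needs to know the payload length in advance. A fixed-size header or terminator would remove that requirement.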

The real-world use case for this comes from copyright protection. By embedding a unique watermark within a media file, the owner of the file can trace the source of unauthorized copies that come to their attention. This can help protect their rights and prevent the unauthorized use of their content, or at least discourage piracy in the first place.

6. Conclusion

Through this article, we took a look at steganography. We saw its earlier implementations before the digital age and how the technology evolved in the digital world. We took a look at Least Significant Bit steganography, a typical and widespread steganography technique, as well as some of its most mainstream real-world applications. Until next time, stay safe!
