The goal of steganalysis is to detect the presence of secretly embedded messages. Depending on how much information the warden has about the steganographic channel she is trying to attack, the detection problem can take many different forms. In the previous chapter, we dealt with the situation in which the warden knows the steganographic method that Alice and Bob might be using. With this knowledge, Eve can tailor her steganalysis to the particular steganographic channel using several strategies outlined in Section 10.3. If Eve has no information about the steganographic method, she needs blind steganalysis capable of detecting as wide a spectrum of steganographic methods as possible. The design and implementation of practical blind steganalysis detectors is the subject of this chapter.
The first and most fundamental step for Eve is to adopt a model of cover images and represent each image using a vector of features. In contrast to targeted steganalysis, where a single feature (e.g., an estimate of message length) was often enough to construct an accurate detector, blind steganalysis by definition requires many features. This is because the role of features in blind steganalysis is significantly more fundamental – in theory they need to capture all possible patterns that natural images follow so that every embedding method the prisoners can devise disturbs at least some of the features. In Section 10.4, we loosely formulated this requirement as completeness of the feature space and outlined possible strategies for constructing good features.
The definition of steganographic security given in the previous chapter should be a guiding design principle for constructing steganographic schemes. The goal is clear – to preserve the statistical distribution of cover images. Unfortunately, digital images are quite complicated objects that do not allow accurate description using simple statistical models. The biggest problem is their non-stationarity and heterogeneity. While it is possible to obtain simple models of individual small flat segments in the image, more complicated textures often present an insurmountable challenge for modeling because of a lack of data to fit an accurate local model. Moreover, and most importantly, as already hinted in Chapter 3, digital images acquired using sensors exhibit many complicated local dependences that the embedding changes may disturb and leave statistically detectable artifacts. Consequently, the lack of good image models gives space to heuristic methods.
In this chapter, we discuss four major guidelines for construction of practical steganographic schemes:
• Preserve a model of the cover source (Section 7.1);
• Make the embedding resemble some natural process (Section 7.2);
• Design the steganography to resist known steganalysis attacks (Section 7.3);
• Minimize the impact of embedding (Section 7.4).
Steganographic schemes from the first class are based on a simplified model of the cover source. The schemes are designed to preserve the model and are thus undetectable within this model.
Steganalysis is the activity directed towards detecting the presence of secret messages. Due to their complexity and dimensionality, digital images are typically analyzed in a low-dimensional feature space. If the features are selected wisely, cover images and stego images will form clusters in the feature space with minimal overlap. If the warden knows the details of the embedding mechanism, she can use this side-information and design the features accordingly. This strategy is recognized as targeted steganalysis. The histogram attack and the attack on Jsteg from Chapter 5 are two examples of targeted attacks.
Three general strategies for constructing features for targeted steganalysis were described in the previous chapter. This chapter presents specific examples of four targeted attacks on steganography in images stored in raster, palette, and JPEG formats. The first attack, called Sample Pairs Analysis, detects LSB embedding in the spatial domain by considering pairs of neighboring pixels. It is one of the most accurate methods for steganalysis of LSB embedding known today. Section 11.1 contains a detailed derivation of this attack as well as several of its variants formulated within the framework of structural steganalysis. The Pairs Analysis attack is the subject of Section 11.2. It was designed to detect steganographic schemes that embed messages in LSBs of color indices to a preordered palette. The EzStego algorithm from Chapter 5 is an example of this embedding method. Pairs Analysis is based on an entirely different principle than Sample Pairs Analysis because it uses information from pixels that can be very distant.
In the previous chapter, we learned that one of the general guiding principles for design of steganographic schemes is the principle of minimizing the embedding impact. The plausible assumption here is that it should be more difficult for Eve to detect Alice and Bob's clandestine activity if they leave behind smaller embedding distortion or “impact.” This chapter introduces a very general methodology called matrix embedding, with which the prisoners can minimize the total number of changes they need to carry out to embed their message and thus increase the embedding efficiency. Even though special cases of matrix embedding can be explained in an elementary fashion on an intuitive level, it is extremely empowering to formulate it within the framework of coding theory. This will require the reader to become familiar with some basic elements of the theory of linear codes. The results are worth the effort because the reader will be able to design more secure stegosystems, acquire a deeper understanding of the subject, and realize connections to an already well-developed research field. Moreover, according to the studies that appeared in [143, 95], matrix embedding is one of the most important design elements of practical stegosystems.
As discussed in Chapter 5, in LSB embedding or ±1 embedding one pixel communicates exactly one message bit. This was the case of OutGuess as well as Jsteg.
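As a minimal sketch of the idea (not code from the book), the following Python fragment implements matrix embedding with the binary [7,4] Hamming code, a standard special case: three message bits are carried by the LSBs of seven cover elements while at most one LSB is changed. The function names syndrome, embed, and extract, and the choice to pass the message as an integer from 0 to 7, are assumptions made for illustration.

```python
# Column j (1-based) of the Hamming parity-check matrix is the number j in binary,
# so the syndrome of a 7-bit block is the XOR of the positions that hold a 1.

def syndrome(bits):
    """Return the 3-bit syndrome (as an integer 0..7) of a block of 7 LSBs."""
    s = 0
    for j, b in enumerate(bits, start=1):
        if b:
            s ^= j
    return s

def embed(cover_lsbs, message):
    """Embed a 3-bit message (integer 0..7) into 7 LSBs, changing at most one."""
    stego = list(cover_lsbs)
    pos = syndrome(stego) ^ message   # position whose flip corrects the syndrome
    if pos:                           # pos == 0: the block already carries the message
        stego[pos - 1] ^= 1
    return stego

def extract(stego_lsbs):
    """The recipient reads the message as the syndrome of the stego block."""
    return syndrome(stego_lsbs)

# Usage: hide the message 0b101 in the LSBs of seven cover elements.
cover = [1, 0, 0, 1, 1, 0, 1]
stego = embed(cover, 0b101)
changes = sum(c != s for c, s in zip(cover, stego))
print(stego, extract(stego), changes)
```

Assuming uniformly random message bits, this scheme makes 7/8 of a change on average per three embedded bits, roughly 3.4 bits per change, whereas embedding one bit per element changes an element with probability 1/2, that is, 2 bits per change.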
In Chapter 6, we learned that steganographic security can be measured with the Kullback–Leibler divergence between the distributions of cover and stego images. Four heuristic principles for minimizing the divergence were discussed in Chapter 7. One of them was the principle of minimal embedding impact, which starts with the assumption that each cover element, i, can be assigned a numerical value, ρ[i], that expresses the contribution to the overall statistical detectability if that cover element was to be changed during embedding. If the values ρ[i] are approximately the same across all cover elements, minimizing the embedding impact is equivalent to minimizing the number of embedding changes. The matrix embedding methods introduced in the previous chapter can be used to achieve this goal.
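For readers who want something concrete, the Kullback–Leibler divergence between two discrete distributions can be computed as below. This is only a sketch under simplifying assumptions: the cover and stego distributions are given as explicit probability lists over the same small alphabet (real image distributions are far too high-dimensional for this), and the function name kl_divergence and the toy numbers are illustrative.

```python
import math

def kl_divergence(p_cover, p_stego):
    """D(P_c || P_s) in bits for two discrete distributions on the same alphabet."""
    d = 0.0
    for pc, ps in zip(p_cover, p_stego):
        if pc == 0:
            continue            # terms with P_c(x) = 0 contribute nothing
        if ps == 0:
            return math.inf     # a cover outcome that is impossible under P_s
        d += pc * math.log2(pc / ps)
    return d

# Toy example: embedding that evens out two neighbouring histogram bins.
p_cover = [0.7, 0.3]
p_stego = [0.5, 0.5]
print(kl_divergence(p_cover, p_stego))   # positive: the change is detectable in principle
print(kl_divergence(p_cover, p_cover))   # zero: identical distributions
```

A divergence of zero corresponds to embedding that preserves the cover distribution exactly; the larger the value, the easier it is in principle to tell the two sources apart.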
If ρ[i] is highly non-uniform, Alice may attempt to restrict the embedding changes to a selection channel formed by those cover elements with small ρ[i]. Constraining the embedding process in this manner, however, brings a fundamental problem. Often, the values ρ[i] are computed from the cover image or some side-information that is not available to Bob. Thus, Bob is generally unable to determine the same selection channel from the stego image and thus read the message. Channels that are not shared between the sender and the recipient are called non-shared selection channels. The main focus of this chapter is construction of methods that enable communication with non-shared selection channels.
Steganography is another term for covert communication. It works by hiding messages in inconspicuous objects that are then sent to the intended recipient. The most important requirement of any steganographic system is that it should be impossible for an eavesdropper to distinguish between ordinary objects and objects that contain secret data.
Steganography in its modern form is relatively young. Until the early 1990s, this unusual mode of secret communication was used only by spies. At that time, it was hardly a research discipline because the methods were a mere collection of clever tricks with little or no theoretical basis that would allow steganography to evolve in the manner we see today. With the subsequent spontaneous transition of communication from analog to digital, this ancient field experienced an explosive rejuvenation. Hiding messages in electronic documents for the purpose of covert communication seemed easy enough to those with some background in computer programming. Soon, steganographic applications appeared on the Internet, giving the masses the ability to hide files in digital images, audio, or text. At the same time, steganography caught the attention of researchers and quickly developed into a rigorous discipline. With it, steganography came to the forefront of discussions at professional meetings, such as the Electronic Imaging meetings annually organized by the SPIE in San Jose, the IEEE International Conference on Image Processing (ICIP), and the ACM Multimedia and Security Workshop. In 1996, the first Information Hiding Workshop took place in Cambridge and this series of workshops has since become the premier annual meeting place to present the latest advancements in theory and applications of data hiding.
The first steganographic techniques for digital media were constructed in the mid 1990s using intuition and heuristics rather than from specific fundamental principles. The designers focused on making the embedding imperceptible rather than undetectable. This objective was undoubtedly caused by the lack of steganalytic methods that used statistical properties of images. Consequently, virtually all early naive data-hiding schemes were successfully attacked later. With the advancement of steganalytic techniques, steganographic methods became more sophisticated, which in turn initiated another wave of research in steganalysis, etc. This characteristic spiral development can be expressed through the following quotation:
Steganography is advanced through analysis.
In this chapter, we describe some very simple data-hiding methods to illustrate the concepts and definitions introduced in Chapter 4 and especially Section 4.3. At the same time, we point out problems with these simple schemes to emphasize the need for a more exact fundamental approach to steganography and steganalysis.
In Section 5.1, we start with the simplest and most common steganographic algorithm – Least-Significant-Bit (LSB) embedding. The fact that LSB embedding is not a very secure method is demonstrated in Section 5.1.1, where we present the histogram attack. Section 5.1.2 describes a different attack on LSB embedding in JPEG images that can not only detect the presence of a secret message but also estimate its size.
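The two ideas named above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the book's algorithm: the message is embedded sequentially (a practical scheme would typically use a key-dependent pseudo-random path), pixels are 8-bit grayscale values, and the function names are hypothetical. The statistic compares each even histogram bin with the mean of its pair of values (2k, 2k+1), since LSB embedding of random bits tends to even out the two bins of each pair; a small value is therefore suspicious.

```python
import random

def lsb_embed(pixels, bits):
    """Replace the LSB of the first len(bits) pixels with the message bits."""
    stego = list(pixels)
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & ~1) | b
    return stego

def lsb_extract(stego, n_bits):
    """Read the message back from the LSBs of the first n_bits pixels."""
    return [p & 1 for p in stego[:n_bits]]

def pair_chi_square(pixels, levels=256):
    """Chi-square statistic of the even bins h[2k] against (h[2k] + h[2k+1]) / 2."""
    h = [0] * levels
    for p in pixels:
        h[p] += 1
    stat = 0.0
    for k in range(levels // 2):
        expected = (h[2 * k] + h[2 * k + 1]) / 2
        if expected > 0:
            stat += (h[2 * k] - expected) ** 2 / expected
    return stat

# Usage on synthetic data: a cover with strongly unbalanced value pairs.
rng = random.Random(1)
cover = [2 * rng.randrange(128) for _ in range(10000)]   # even values only
message = [rng.randrange(2) for _ in range(len(cover))]  # fully embedded image
stego = lsb_embed(cover, message)
print(pair_chi_square(cover), pair_chi_square(stego))
```

The synthetic cover is deliberately extreme so that the drop in the statistic after embedding is obvious; for natural images the difference is smaller, and a full test would convert the statistic into a p-value.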
Some of the first steganographic methods were designed for palette images, which is the topic of Section 5.2. We discuss six different ideas for hiding information in palette images and point out their weaknesses as well as other problematic issues pertaining to their design.
This book focuses on steganographic methods that embed messages in digital images by slightly modifying them. In this chapter, we explain the process by which digital images are created. This knowledge will help us design more secure steganography methods as well as build more sensitive detection schemes (steganalysis).
Fundamentally, there exist two mechanisms through which digital images can be created. They can be synthesized on a computer or acquired through a sensor. Computer-generated images, such as charts, line drawings, diagrams, and other simple graphics generated using drawing tools, could, in principle, be made to hold a small amount of secret data by the selection of colors, object types (line type, fonts), their positions or dimensions, etc. Realistic-looking computer graphics generated from three-dimensional models (or measurements) using specialized methods, such as ray-tracing or radiosity, are typically not very friendly for steganography as they are generated by deterministic algorithms using well-defined rules. In this book, we will mostly deal with images acquired with cameras or scanners because they are far more ubiquitous than computer-generated images and provide a friendlier environment for steganography. As with any categorization, the boundary between the two image types (real versus computer-generated) is blurry. For example, it is not immediately clear how one should classify a digital-camera image processed in Photoshop to make it look like Claude Monet's style of painting or a collage of computer-generated and real images.
In the previous chapter, we saw a few examples of simple steganographic schemes and successful attacks on them. We learned that the steganographic scheme called LSB embedding leaves a characteristic imprint on the image histogram that does not occur in natural images. This observation led to an algorithm (a detector) that could decide whether or not an image contains a secret message. The existence of such a detector means that LSB embedding is not secure. We expect that for truly secure steganography it should be impossible to construct a detector that could distinguish between cover and stego images. Even though this statement appears reasonable at first sight, it is vague and allows subjective interpretations. For example, it is not clear what is meant by “could distinguish between cover and stego images.” We cannot construct a detector that will always be 100% correct because it is hardly possible to detect the effects of flipping one LSB, at least not reliably in every cover. Just how reliable must a detector be to pronounce a steganographic method insecure?
Even though there are no simple practical solutions to the questions raised in the previous paragraph, they can in principle be studied within the framework of information theory. Imagine that Alice and Bob are engaging in a legitimate communication and do not use steganography. Let us suppose that they exchange grayscale 512 × 512 images in raster format that were never compressed.
The main goal of steganography is to communicate secret messages without making it apparent that a secret is being communicated. This can be achieved by hiding messages in ordinary-looking objects, which are then sent in an overt manner through some communication channel. In this chapter, we look at the individual elements that define steganographic communication.
Before Alice and Bob can start communicating secretly, they must agree on some basic communication protocol they will follow in the future. In particular, they need to select the type of cover objects they will use for sending secrets. Second, they need to design the message-hiding and message-extraction algorithms. For increased security, the prisoners should make both algorithms dependent on a secret key so that no one else besides them will be able to read their messages. Besides the type of covers and the inner workings of the steganographic algorithm, Eve's ability to detect that the prisoners are communicating secretly will also depend on the size of the messages that Alice and Bob will communicate. Finally, the prisoners will send their messages through a channel that is under the control of the warden, who may or may not interfere with the communication.
We recognize the following five basic elements of every steganographic channel (see Figure 4.1):
• Source of covers,
• Data-embedding and -extraction algorithms,
• Source of stego keys driving the embedding/extraction algorithms,
• Source of messages,
• Channel used to exchange data between Alice and Bob.