
An overview on video forensics

Published online by Cambridge University Press:  28 August 2012

Simone Milani*
Affiliation:
Politecnico di Milano, Dipartimento di Elettronica e Informazione, Milano, Italy
Marco Fontani
Affiliation:
University of Siena, Department of Information Engineering, Siena, Italy; National Inter-University Consortium for Telecommunications (CNIT), Florence, Italy
Paolo Bestagini
Affiliation:
Politecnico di Milano, Dipartimento di Elettronica e Informazione, Milano, Italy
Mauro Barni
Affiliation:
University of Siena, Department of Information Engineering, Siena, Italy; National Inter-University Consortium for Telecommunications (CNIT), Florence, Italy
Alessandro Piva
Affiliation:
University of Florence, Department of Electronics and Telecommunications, Florence, Italy; National Inter-University Consortium for Telecommunications (CNIT), Florence, Italy
Marco Tagliasacchi
Affiliation:
Politecnico di Milano, Dipartimento di Elettronica e Informazione, Milano, Italy
Stefano Tubaro
Affiliation:
Politecnico di Milano, Dipartimento di Elettronica e Informazione, Milano, Italy
* Corresponding author: S. Milani. E-mail: milani@elet.polimi.it

Abstract

The broad availability of tools for the acquisition and processing of multimedia signals has recently led to the concern that images and videos cannot be considered trustworthy evidence, since they can be altered rather easily. This possibility raises the need to verify whether multimedia content, which can be downloaded from the internet, acquired by a video surveillance system, or received by a digital TV broadcaster, is original or not. To cope with these issues, signal processing experts have been investigating effective video forensic strategies aimed at reconstructing the processing history of the video data under investigation and validating their origins. The key assumption of these techniques is that most alterations are not reversible and leave “footprints” in the reconstructed signal, which can be analyzed in order to identify the previous processing steps. This paper presents an overview of the video forensic techniques that have been proposed in the literature, focusing on the acquisition, compression, and editing operations, and highlighting the strengths and weaknesses of each solution. It also provides a review of simple processing chains that combine different operations. Anti-forensic techniques are also considered, to outline the current limitations and highlight the open research issues.

Information

Type
Overview Paper
Copyright
Copyright © The Authors 2012. The online version of this article is published within an Open Access environment subject to the conditions of the Creative Commons Attribution-NonCommercial-ShareAlike license <http://creativecommons.org/licenses/by-nc-sa/3.0/>. The written permission of Cambridge University Press must be obtained for commercial re-use.

Fig. 1. Typical acquisition pipeline: light enters the camera through the lens, is filtered by the CFA and converted to a digital signal by the sensor. Usually, this is followed by some in-camera post-processing and compression. In some cases, the video can be projected/displayed and re-acquired with another camera, usually undergoing lighting and spatial distortions.


Fig. 2. A simple field weaving algorithm for video de-interlacing. This scheme uses T fields to produce a de-interlaced video of T/2 frames.
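The weaving scheme in Fig. 2 can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: it assumes fields are given as lists of rows, that they arrive in top-field-first order (even-indexed fields carry the even lines of a frame), and the name `weave_fields` is hypothetical.

```python
def weave_fields(fields):
    """Interleave consecutive field pairs into full frames (T fields -> T/2 frames).

    Assumption: fields[0], fields[2], ... hold the even (top) lines and
    fields[1], fields[3], ... hold the odd (bottom) lines.
    """
    if len(fields) % 2 != 0:
        raise ValueError("expected an even number of fields")
    frames = []
    for t in range(0, len(fields), 2):
        top, bottom = fields[t], fields[t + 1]
        frame = []
        # Alternate one row from the top field and one from the bottom field.
        for top_row, bottom_row in zip(top, bottom):
            frame.append(list(top_row))
            frame.append(list(bottom_row))
        frames.append(frame)
    return frames
```

Because the two woven fields are captured at different time instants, moving objects appear displaced between adjacent lines, which is the source of the combing artifact associated with this scheme.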


Fig. 3. Simplified block diagram of a conventional video codec. P computes the prediction, T the orthonormal transform, Q is the quantizer, and F is responsible for rounding and in-loop filtering.


Fig. 4. Original (a) and compressed (b) frames of a standard video sequence. The high compression rate is responsible for blocking artifacts.


Fig. 5. Histograms of DCT coefficients (c1, c2, c3) before (first row) and after (second row) quantization. The quantization step Δ(i, j) can be estimated by the gaps between consecutive peaks.
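The peak-gap idea in Fig. 5 can be illustrated with a small sketch. It is a simplification under stated assumptions: the input is a list of reconstructed coefficients for a single DCT frequency (i, j), a histogram bin counts as a peak when it holds at least `min_count` samples, and the step is taken as the most frequent gap between consecutive peaks. The function name and the `min_count` threshold are hypothetical, not taken from the paper.

```python
from collections import Counter


def estimate_quant_step(coeffs, min_count=2):
    """Estimate the quantization step from gaps between histogram peaks.

    Returns None when fewer than two peaks are found.
    """
    # Histogram over integer-rounded coefficient values.
    hist = Counter(int(round(c)) for c in coeffs)
    # Treat sufficiently populated bins as peaks (assumed heuristic).
    peaks = sorted(value for value, count in hist.items() if count >= min_count)
    if len(peaks) < 2:
        return None
    # The most common spacing between neighbouring peaks is the estimate.
    gaps = Counter(b - a for a, b in zip(peaks, peaks[1:]))
    return gaps.most_common(1)[0][0]
```

A practical detector would need to be more robust, for example when some multiples of the step are unpopulated or when rounding noise spreads the peaks.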


Fig. 6. In this example, the first six frames of the original MPEG compressed video (first row) are deleted, thus obtaining a new sequence (second row). When this sequence is re-compressed using MPEG, each GOP will contain frames that belonged to different GOPs in the original video (frames highlighted in yellow in the third row).
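The GOP misalignment described in Fig. 6 can be reproduced with a short sketch. It assumes a fixed GOP length that is the same in both compressions; the caption does not state the GOP size, so the value used in the usage note is purely illustrative, and the function name is hypothetical.

```python
def original_gops_per_new_gop(num_frames, gop_size, num_deleted):
    """For each GOP of the re-compressed video, list the original GOPs it spans.

    Assumes the first `num_deleted` frames are removed and that both
    encodings use the same fixed `gop_size`.
    """
    # Original frame indices that survive the deletion.
    remaining = list(range(num_deleted, num_frames))
    mapping = []
    for start in range(0, len(remaining), gop_size):
        new_gop = remaining[start:start + gop_size]
        # A frame's original GOP index is its original position // gop_size.
        mapping.append(sorted({frame // gop_size for frame in new_gop}))
    return mapping
```

For instance, with 36 frames, an assumed GOP size of 12, and six deleted frames, the call `original_gops_per_new_gop(36, 12, 6)` returns `[[0, 1], [1, 2], [2]]`: the first two new GOPs each mix frames from two original GOPs.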


Fig. 7. Video interpolation based on line averaging, which is a field extension scheme. Compared to the method in Fig. 2, this one has the advantage of producing a final video with T frames instead of T/2, without showing the combing artifact. On the other hand, vertical resolution is halved.
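A minimal sketch of line averaging, under stated assumptions: a field is a list of rows, `parity` says whether the field holds the even (0) or odd (1) lines of the output frame, and a missing line at the frame border is copied from its only neighbour. The function name and the border handling are illustrative choices, not taken from the paper.

```python
def line_average(field, parity):
    """Expand one field of H/2 rows into a full frame of H rows.

    parity = 0: the field supplies even rows; parity = 1: odd rows.
    Missing rows are the average of the rows directly above and below.
    """
    height = 2 * len(field)
    frame = [None] * height
    # Place the known field rows at their positions in the frame.
    for i, row in enumerate(field):
        frame[2 * i + parity] = list(row)
    # Fill every missing row from its vertical neighbours.
    for y in range(1 - parity, height, 2):
        above = frame[y - 1] if y - 1 >= 0 else None
        below = frame[y + 1] if y + 1 < height else None
        if above is None:
            frame[y] = list(below)
        elif below is None:
            frame[y] = list(above)
        else:
            frame[y] = [(a + b) / 2 for a, b in zip(above, below)]
    return frame
```

Applying this to each of the T fields yields T frames, each built from a single time instant, which is why no combing appears while half of the lines in every frame are interpolated rather than captured.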


Fig. 8. An example of near-duplicate frames of a video.


Fig. 9. The ground-truth phylogeny tree for the near-duplicate set in Fig. 8.