Over the last few years, the number of digital photos produced per year has grown rapidly. This growth poses challenges in the organization and management of multimedia collections, and one viable solution consists of arranging the media on the basis of the underlying events. However, album-level annotation and the presence of irrelevant pictures in photo collections make event-based organization of personal photo albums a more challenging task. To tackle these challenges, in contrast to conventional approaches relying on supervised learning, we propose a pipeline for event recognition in personal photo collections relying on a multiple-instance learning (MIL) strategy. MIL is a modified form of supervised learning and is well suited to such applications with weakly labeled data. The experimental evaluation of the proposed approach is carried out on two large-scale datasets, one self-collected and one benchmark. On both, our approach significantly outperforms the existing state of the art.
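The multiple-instance idea can be illustrated with a minimal sketch. This is not the pipeline proposed in the paper: it only shows the standard MIL assumption, where an album is a bag of photos, only the album carries a label, and the album is positive for an event if at least one photo is. The function name, the event names, and the per-photo scores are hypothetical.

```python
import numpy as np

def album_event(photo_scores, event_names):
    """Predict an album-level event from per-photo scores.

    photo_scores: shape (n_photos, n_events), hypothetical outputs of
    some per-photo classifier. Max-pooling over photos means irrelevant
    pictures (low scores everywhere) do not affect the album decision.
    """
    scores = np.asarray(photo_scores, dtype=float)
    bag_scores = scores.max(axis=0)  # best photo per event
    return event_names[int(bag_scores.argmax())], bag_scores

# Hypothetical album of three photos scored against two events.
events = ["birthday", "hiking"]
album = [[0.1, 0.2],   # irrelevant photo
         [0.9, 0.1],   # clearly a birthday photo
         [0.2, 0.3]]   # irrelevant photo
label, bag = album_event(album, events)
```

Here a single informative photo is enough to label the whole album, which is why the formulation tolerates album-level annotation.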
Generally, we expect images to be an honest reflection of reality. However, this assumption is undermined by new image editing technologies, which allow for easy manipulation and distortion of digital content. Our understanding of the implications of using manipulated data is lagging behind. In this paper we propose to exploit crowdsourcing tools in order to analyze the impact of different types of manipulation on users’ perceptions of deception. Our goal is to gain significant insights about how different types of manipulations impact users’ perceptions and how the context in which a modified image is used influences human perception of image deceptiveness. Through an extensive crowdsourcing user study, we aim at demonstrating that the problem of predicting user-perceived deception can be approached by automatic methods. Analysis of results collected on the Amazon Mechanical Turk platform highlights how deception is related to the level of modifications applied to the image and to the context in which modified pictures are used. To the best of our knowledge, this work represents the first attempt to address the image editing debate using automatic approaches and going beyond the investigation of forgeries.
Automatic video analysis and understanding has become a high-interest research topic, with applications to video browsing, content-based video indexing, and visual surveillance. However, automating this process remains challenging because of the clutter produced by low-level processing operations. This common problem can be addressed by embedding significant contextual information into the data and by using simple syntactic approaches to match actual sequences against models. In this context, we propose a novel framework that employs a symbolic representation of complex activities through sequences of atomic actions based on a weighted Context-Free Grammar.
Proc. SPIE. 8667, Multimedia Content and Mobile Devices
KEYWORDS: Content based image retrieval, Image fusion, Visualization, Image segmentation, Image processing, Feature extraction, Image retrieval, Information science, Information visualization, RGB color model
Diversification of retrieval results allows for better and faster search. Recently, different
methods have been proposed for the diversification of image retrieval results, mainly utilizing text information and techniques imported from the natural language processing domain. However, images contain visual information that is impossible to describe in text, and the use of visual features is inevitable. Visual saliency is information about the main object of an image, implicitly included by humans while creating visual content. For this reason it is natural to exploit this information for the task of diversifying the content. In this work we study whether visual saliency can be used for the task of diversification and propose a method for re-ranking image retrieval results using saliency. The evaluation has shown that the use of saliency information results in higher diversity of retrieval results.
The last few years have seen a massive increase in the use of the Internet as a channel for sharing and
transmitting data, thus creating the need for copyright protection schemes able to preserve the ownership of
the data. The idea of embedding a watermark directly in the data is, however, unacceptable in various fields
of application, due to the intrinsic degradation introduced by non-reversible watermarking schemes. Hence
some zero-watermarking schemes have been developed. In this work we propose an optimization of a recent
watermarking method based on visual cryptography, improving results against the most common types of attacks
and achieving a higher perceptual quality of the extracted mark.
Reversible data hiding deals with the insertion of auxiliary information into host data without causing any permanent degradation to the original signal. In this contribution a high-capacity reversible data hiding scheme, based on the classical difference expansion insertion algorithm, is presented. The method exploits a prediction stage, followed by prediction-error modification, both in the spatial domain and in the S-transform domain. This two-step embedding allows us to achieve high embedding capacity while preserving a high image quality, as demonstrated in the experimental results.
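For readers unfamiliar with difference expansion, the following is a minimal sketch of the classical pair-based variant on which such schemes build. It is not the prediction-based method of the paper: it embeds one bit into one pair of integer samples and omits the overflow checks a real implementation needs. The function names are illustrative.

```python
def de_embed(x, y, bit):
    """Embed one bit into the integer pair (x, y) by difference expansion."""
    avg = (x + y) // 2          # integer average, kept unchanged by embedding
    diff = x - y                # difference that will carry the bit
    diff2 = 2 * diff + bit      # expand the difference, append the bit as LSB
    # No overflow/underflow check here: real schemes skip non-expandable pairs.
    return avg + (diff2 + 1) // 2, avg - diff2 // 2

def de_extract(x2, y2):
    """Recover the bit and the original pair from a marked pair."""
    avg = (x2 + y2) // 2
    diff2 = x2 - y2
    bit = diff2 & 1             # the embedded bit is the LSB of the difference
    diff = diff2 >> 1           # floor division by 2 restores the original difference
    return bit, avg + (diff + 1) // 2, avg - diff // 2

marked = de_embed(206, 201, 1)   # (209, 198)
restored = de_extract(*marked)   # (1, 206, 201)
```

The pair is restored exactly after extraction, which is what makes the embedding reversible; the cost is that the difference roughly doubles, so distortion grows with the magnitude of the original difference.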
Source identification for digital content is one of the main branches of digital image forensics. It relies on the
extraction of the photo-response non-uniformity (PRNU) noise as a unique intrinsic fingerprint that efficiently
characterizes the digital device which generated the content. Such noise is estimated as the difference between
the content and its de-noised version obtained via denoising filter processing. This paper proposes a performance
comparison of different denoising filters for source identification purposes. In particular, results achieved with
a sophisticated 3D filter are presented and discussed with respect to state-of-the-art denoising filters previously
employed in such a context.
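The residual computation described above can be sketched as follows. This is an illustration only: the 3x3 mean filter is a deliberately simple stand-in for the denoising filters compared in the paper, and the function names are assumptions rather than any library's API.

```python
import numpy as np

def box_denoise(img):
    """3x3 mean filter with edge padding; a stand-in denoiser (assumption)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    acc = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            acc += padded[dy:dy + h, dx:dx + w]
    return acc / 9.0

def noise_residual(img):
    """Noise estimate: the image minus its denoised version."""
    img = np.asarray(img, dtype=float)
    return img - box_denoise(img)

def normalized_correlation(a, b):
    """Similarity between two residuals, e.g. a test image and a fingerprint."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b)))
```

In this setting, swapping `box_denoise` for a better filter changes how much scene content leaks into the residual, which is the effect a filter comparison for source identification would measure.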
This paper presents an innovative watermarking scheme which allows the insertion of information in the Discrete
Cosine Transform (DCT) domain, increasing the perceptual quality of the watermarked images by exploiting
the masking effect of the DCT coefficients. Specifically, we propose to make the strength of the embedded data
adaptive by following the characteristics of the Human Visual System (HVS) with respect to image viewing.
Improvements in the perceived quality of the modified data are evaluated by means of various perceptual quality
metrics, as demonstrated by experimental results.
In this paper we propose to evaluate both robustness and security of digital image watermarking techniques by
considering the perceptual quality of un-marked images in terms of Weighted PSNR. The proposed tool is based on
genetic algorithms and is suitable for researchers to evaluate the robustness performance of developed watermarking
methods. Given a combination of selected attacks, the proposed framework looks for a fine parameterization of them
ensuring a perceptual quality of the un-marked image lower than a given threshold. Correspondingly, a novel metric for
robustness assessment is introduced. On the other hand, this tool is also useful in those scenarios where an
attacker tries to remove the watermark to overcome copyright issues. Security assessment is provided by a stochastic
search of the minimum degradation that needs to be introduced in order to obtain an un-marked version of the image as
close as possible to the given one. Experimental results show the effectiveness of the proposed approach.
In this paper a joint watermarking and ciphering scheme for digital images is presented. Both operations are
performed in a key-dependent transform domain. The commutative property of the proposed method makes it possible to
cipher a watermarked image without interfering with the embedded signal, or to watermark an encrypted image
while still allowing perfect deciphering. Furthermore, the key dependence of the transform domain increases the
security of the overall system. Experimental results show the effectiveness of the proposed scheme.
Here we introduce a novel watermarking paradigm designed to be both asymmetric, i.e., involving a private key
for embedding and a public key for detection, and commutative with a suitable encryption scheme, allowing
both to cipher watermarked data and to mark encrypted data without interfering with the detection process.
In order to demonstrate the effectiveness of the above principles, we present an explicit example where the
watermarking part, based on elementary linear algebra, and the encryption part, exploiting a secret random
permutation, are integrated in a commutative scheme.
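The commutativity property can be checked on a toy example. The sketch below is not the asymmetric scheme of the paper: it pairs a secret-permutation cipher with a hypothetical element-wise parity-quantization embedding, chosen only because any element-wise operation commutes with a permutation of sample positions. All function names and the `step` parameter are illustrative, and clipping to a valid pixel range is omitted.

```python
import numpy as np

def encrypt(x, perm):
    """Cipher by moving samples according to a secret permutation."""
    return x[perm]

def decrypt(c, perm):
    """Invert the permutation cipher."""
    out = np.empty_like(c)
    out[perm] = c
    return out

def embed(x, bit, step=8):
    """Toy mark: force the parity of each sample's quantization cell to `bit`."""
    q = x // step
    q = q + ((q % 2) != bit)       # move to the next cell if the parity is wrong
    return q * step + step // 2    # re-centre the sample inside its cell

def detect(y, step=8):
    """Read the mark by majority vote over the cell parities."""
    return int(((y // step) % 2).mean() > 0.5)

rng = np.random.default_rng(1)
x = rng.integers(0, 200, size=32)          # hypothetical host samples
perm = rng.permutation(32)                 # hypothetical secret key
mark_then_cipher = encrypt(embed(x, 1), perm)
cipher_then_mark = embed(encrypt(x, perm), 1)
```

Both orders give the same result, and the mark can be read from the encrypted samples without the key, which is the behaviour the commutative paradigm asks for.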
In this paper a novel method for watermarking and ciphering color images is presented. The aim of the system is
to allow the watermarking of encrypted data without requiring knowledge of the original data. By using this
method, it is also possible to cipher watermarked data without damaging the embedded signal. Furthermore, the
extraction of the hidden information can be performed without deciphering the cover data and it is also possible
to decipher watermarked data without removing the watermark. The transform domain adopted in this work is the Fibonacci-Haar wavelet transform. The experimental results show the effectiveness of the proposed scheme.
This paper proposes a novel data hiding scheme in which a payload is embedded into the discrete cosine transform domain. The characteristics of the Human Visual System (HVS) with respect to image viewing are exploited to adapt the strength of the embedded data and are integrated in the design of a digital image watermarking system. By using an HVS-inspired image quality metric, we study the relation between the amount of data that can be embedded and the resulting perceived quality. This study allows one to increase the robustness of the watermarked image without damaging the perceived quality or, alternatively, to reduce the impairments produced by the watermarking process given a fixed embedding strength. Experimental results show the effectiveness and the robustness of the proposed solution.