Deep learning-based algorithms have become increasingly efficient in recognition and detection tasks, especially when they are trained on large-scale datasets. This recent success has led to speculation that deep learning methods are comparable to, or even outperform, the human visual system in the ability to detect and recognize objects and their features. In this paper, we focus on the specific task of gender recognition in images that have been processed by privacy protection filters (e.g., blurring, masking, and pixelization) applied at different strengths. Assuming a privacy protection scenario, we compare the performance of state-of-the-art deep learning algorithms with a subjective evaluation obtained via crowdsourcing to understand how privacy protection filters affect both machine and human vision.
Many tools have been proposed for preserving privacy. However, the visual privacy protection tools available today lack some or all of the important properties expected of such tools. Therefore, in this paper, we propose a simple yet effective method for privacy protection based on false color visualization, which maps the color palette of an image into a different color palette, possibly after a compressive point transformation of the original pixel data, distorting the details of the original image. This method does not require prior detection of faces or other sensitive regions and, hence, unlike typical privacy protection methods, it is less sensitive to inaccurate computer vision algorithms. It is also secure, as the look-up tables can be encrypted; reversible, as table look-ups can be inverted; flexible, as it is independent of format or encoding; adjustable, as the final result can be computed by interpolating the false color image with the original using different degrees of interpolation; less distracting, as it does not create visually unpleasant artifacts; and selective, as it better preserves the semantic structure of the input. Four different color scales and four different compression functions, on which the proposed method relies, are evaluated via objective (three face recognition algorithms) and subjective (50 human subjects in an online study) assessments using faces from the public FERET dataset. The evaluations demonstrate that the DEF and RBS color scales lead to the strongest privacy protection, while compression functions add little to the strength of privacy protection. Statistical analysis also shows that recognition algorithms and human subjects perceive the proposed protection similarly.
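The false color idea described above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the logarithmic compression function, the blue-to-red ramp palette (it is not the DEF or RBS scale evaluated in the paper), and the names compress, build_palette, and false_color. Encryption and inversion of the look-up table are not shown.

```python
import math

def compress(value, max_value=255):
    """Hypothetical compressive point transformation: log mapping of [0, max_value] to [0, 1]."""
    return math.log1p(value) / math.log1p(max_value)

def build_palette(size=256):
    """Hypothetical false-color look-up table: a blue-to-red ramp with a green peak in the middle."""
    palette = []
    for i in range(size):
        t = i / (size - 1)
        r = round(255 * t)
        g = round(255 * (1 - abs(2 * t - 1)))
        b = round(255 * (1 - t))
        palette.append((r, g, b))
    return palette

def false_color(gray_pixels, palette, alpha=1.0):
    """Map grayscale pixels through the look-up table, then blend with the original.

    alpha=1.0 yields the full false-color result; alpha=0.0 returns the
    original gray value replicated on all three channels.
    """
    out = []
    for v in gray_pixels:
        index = round(compress(v) * (len(palette) - 1))
        out.append(tuple(round(alpha * c + (1 - alpha) * v) for c in palette[index]))
    return out

palette = build_palette()
protected = false_color([0, 64, 128, 255], palette, alpha=1.0)
half_strength = false_color([0, 64, 128, 255], palette, alpha=0.5)
```

The alpha parameter corresponds to the adjustable degree of interpolation between the false color image and the original mentioned in the abstract.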
Crowdsourcing is becoming a popular, cost-effective alternative to lab-based evaluations for subjective quality assessment. However, crowd-based evaluations are constrained by the limited availability of display devices used by typical online workers, which makes the evaluation of high dynamic range (HDR) content a challenging task. In this paper, we investigate the feasibility of using low dynamic range versions of original HDR content obtained with tone mapping operators (TMOs) in crowdsourcing evaluations. We conducted two crowdsourcing experiments by employing workers from the Microworkers platform. In the first experiment, we evaluate five HDR images encoded at different bit rates with the upcoming JPEG XT coding standard. To find the most suitable TMO, we create eleven tone-mapped versions of these five HDR images by using eleven different TMOs. The crowdsourcing results are compared to a reference ground truth obtained via a subjective assessment of the same HDR images on a Dolby 'Pulsar' HDR monitor in a laboratory environment. The second crowdsourcing evaluation uses semantic differentiators to better understand the characteristics of the eleven different TMOs. The crowdsourcing evaluations show that some TMOs are more suitable than others for evaluation of HDR image compression.
High dynamic range (HDR) imaging is able to capture a wide range of luminance values, closer to what the human eye can perceive. However, for capture and display technologies, it is important to answer the question on the significance of higher dynamic range for user preference. This paper answers this question by investigating the added value of higher dynamic range via a rigorous set of subjective experiments using paired comparison methodology. Video sequences at four different peak luminance levels were displayed side-by-side on a Dolby Research HDR RGB backlight dual modulation display (aka ‘Pulsar’), which is capable of reliably displaying video content at 4000 cd/m² peak luminance. The results of the subjective experiment demonstrate that the preference of an average viewer increases logarithmically with the increase in the maximum luminance level at which HDR content is displayed, with 4000 cd/m² being the most attractive option.
The ability of high dynamic range (HDR) imaging to capture details in environments with high contrast has a significant impact on privacy in video surveillance. However, the extent to which HDR imaging affects privacy, when compared to typical low dynamic range (LDR) imaging, is neither well studied nor well understood. To study this effect, a suitable dataset of images and video sequences is needed. Therefore, we have created a publicly available dataset of HDR video for privacy evaluation, PEViD-HDR, which is an HDR extension of the existing Privacy Evaluation Video Dataset (PEViD). The PEViD-HDR video dataset can help in the evaluation of privacy protection tools, as well as in showing the importance of HDR imaging in video surveillance applications and its influence on the privacy-intelligibility trade-off. We conducted a preliminary subjective experiment demonstrating the usability of the created dataset for evaluation of privacy issues in video. The results confirm that a tone-mapped HDR video contains more privacy-sensitive information and details compared to a typical LDR video.
Proc. SPIE. 9138, Optics, Photonics, and Digital Technologies for Multimedia Applications III
KEYWORDS: Image fusion, Facial recognition systems, Detection and tracking algorithms, Cameras, Video, Photography, Video surveillance, Light sources and illumination, Range imaging, High dynamic range imaging
The ability of High Dynamic Range imaging (HDRi) to capture details in high-contrast environments, making both dark and bright regions clearly visible, has strong implications for privacy. However, the extent to which HDRi affects privacy when it is used instead of typical Standard Dynamic Range imaging (SDRi) is not yet clear. In this paper, we investigate the effect of HDRi on privacy via crowdsourcing evaluation using the Microworkers platform. Due to the lack of a standard HDRi privacy evaluation dataset, we have created such a dataset containing people of varying gender, race, and age, shot indoors and outdoors and under a large range of lighting conditions. We evaluate the tone-mapped versions of these images, obtained by several representative tone-mapping algorithms, using a subjective privacy evaluation methodology. Evaluation was performed using a crowdsourcing-based framework, because it is a popular and effective alternative to traditional lab-based assessment. The results of the experiments demonstrate a significant loss of privacy even when tone-mapped versions of HDR images are used, compared to typical SDR images shot with a standard exposure.
Extensive adoption of video surveillance, affecting many aspects of our daily lives, alarms the public about the increasing invasion into personal privacy. To address these concerns, many tools have been proposed for protection of personal privacy in image and video. However, little is understood regarding the effectiveness of such tools and especially their impact on the underlying surveillance tasks, leading to a tradeoff between the preservation of privacy offered by these tools and the intelligibility of activities under video surveillance. In this paper, we investigate this privacy-intelligibility tradeoff by proposing an objective framework for evaluation of privacy filters. We apply the proposed framework to a use case where privacy of people is protected by obscuring faces, assuming an automated video surveillance system. We used several popular privacy protection filters, such as blurring, pixelization, and masking, and applied them with varying strengths to people's faces from different public datasets of video surveillance footage. Accuracy of a face detection algorithm was used as a measure of intelligibility (a face should be detected to perform a surveillance task), and accuracy of a face recognition algorithm as a measure of privacy (a specific person should not be identified). Under these conditions, after application of an ideal privacy protection tool, an obfuscated face would be visible as a face but would not be correctly identified by the recognition algorithm. The experiments demonstrate that, in general, an increase in strength of the privacy filters under consideration leads to an increase in privacy (i.e., reduction in recognition accuracy) and to a decrease in intelligibility (i.e., reduction in detection accuracy). Masking also proves to be the most favorable filter across all tested datasets.
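Two of the filters named above, pixelization and masking, can be sketched on a grayscale image stored as a list of rows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names, the use of the block size as the filter "strength", and the rectangular mask region are all hypothetical choices.

```python
def pixelize(image, block):
    """Replace every block x block tile with its (integer) mean intensity.

    A larger block is a stronger filter; block=1 leaves the image unchanged.
    """
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            total = sum(image[y][x] for y in rows for x in cols)
            mean = total // (len(rows) * len(cols))
            for y in rows:
                for x in cols:
                    out[y][x] = mean
    return out

def mask(image, top, left, bottom, right, fill=0):
    """Overwrite the rectangle [top, bottom) x [left, right) with a constant value."""
    out = [row[:] for row in image]
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = fill
    return out

face = [
    [0, 10, 20, 30],
    [20, 30, 40, 50],
    [40, 50, 60, 70],
    [60, 70, 80, 90],
]
weak = pixelize(face, 2)    # 2x2 tiles
strong = pixelize(face, 4)  # a single tile covering the whole region
hidden = mask(face, 0, 0, 4, 4)
```

In an evaluation of the kind described, each filtered face would then be passed to a detector and a recognizer, with detection accuracy read as intelligibility and a drop in recognition accuracy read as a gain in privacy.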
Visual privacy protection, i.e., obfuscation of personal visual information in video surveillance, is an important and increasingly popular research topic. However, while many datasets are available for testing the performance of various video analytics, little to nothing exists for evaluation of visual privacy tools. Since surveillance and privacy protection have contradictory objectives, the design principles of the corresponding evaluation datasets should differ too. In this paper, we outline principles that need to be considered when building a dataset for privacy evaluation. Following these principles, we present a new, and to our knowledge the first, Privacy Evaluation Video Dataset (PEViD). With the dataset, we provide XML-based annotations of various privacy regions, including face, accessories, skin regions, hair, body silhouette, and other personal information, and their descriptions. Via preliminary subjective tests, we demonstrate the flexibility and suitability of the dataset for privacy evaluations. The evaluation results also show the importance of secondary privacy regions that contain non-facial personal information for the privacy-intelligibility tradeoff. We believe that the PEViD dataset is equally suitable for evaluations of privacy protection tools using objective metrics and subjective assessments.
High-dynamic range (HDR) imaging is expected, together with ultrahigh definition and high-frame rate video, to become a technology that may change the photo, TV, and film industries. Many cameras and displays capable of capturing and rendering both HDR images and video are already available on the market. The popularity and full public adoption of HDR content is, however, hindered by the lack of standards in evaluation of quality, file formats, and compression, as well as the large legacy base of low-dynamic range (LDR) displays that are unable to render HDR. To facilitate the widespread use of HDR, backward compatibility of HDR with commonly used legacy technologies for storage, rendering, and compression of video and images is necessary. Although many tone-mapping algorithms have been developed for generating viewable LDR content from HDR, there is no consensus on which algorithm to use and under which conditions. Via a series of subjective evaluations, we demonstrate the dependency of the perceptual quality of tone-mapped LDR images on the context: environmental factors, display parameters, and the image content itself. Based on the results of the subjective tests, we propose to extend the JPEG file format, the most popular image format, in a backward compatible manner to also handle HDR images. An architecture to achieve such backward compatibility with JPEG is proposed. A simple implementation of lossy compression demonstrates the efficiency of the proposed architecture compared with state-of-the-art HDR image compression.
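As an illustration of how a tone-mapping algorithm produces viewable LDR values from HDR luminance, here is a minimal sketch of a simple global operator of the form L / (1 + L), followed by gamma encoding to 8 bits. This operator is chosen only because it is short; the abstract does not say which tone-mapping algorithms were evaluated, and the function name and the default gamma of 2.2 are assumptions.

```python
def tone_map(luminance, gamma=2.2):
    """Compress non-negative HDR luminance values into 8-bit LDR code values.

    Each value L is mapped to L / (1 + L), which lies in [0, 1), and is then
    gamma encoded and scaled to the 0..255 range.
    """
    out = []
    for value in luminance:
        compressed = value / (1.0 + value)
        out.append(round(255 * compressed ** (1.0 / gamma)))
    return out

# Relative scene luminance spanning several orders of magnitude.
hdr = [0.0, 0.01, 0.5, 3.0, 100.0, 4000.0]
ldr = tone_map(hdr)
```

The mapping is monotonic, so brightness ordering is preserved, while very bright values are squeezed toward the top of the 8-bit range. Which details survive depends on the operator and the content, consistent with the lack of consensus noted above.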
Proc. SPIE. 8499, Applications of Digital Image Processing XXXV
KEYWORDS: Cell phones, Image compression, Image processing, Computer programming, Tablets, Image quality, Video compression, Light sources and illumination, High dynamic range imaging, Algorithm development
High Dynamic Range (HDR) imaging is expected to become one of the technologies that could shape next
generation of consumer digital photography. Manufacturers are rolling out cameras and displays capable of
capturing and rendering HDR images. The popularity and full public adoption of HDR content is however
hindered by the lack of standards in evaluation of quality, file formats, and compression, as well as large legacy
base of Low Dynamic Range (LDR) displays that are unable to render HDR. To facilitate wide spread of HDR
usage, the backward compatibility of HDR technology with commonly used legacy image storage, rendering,
and compression is necessary. Although many tone-mapping algorithms were developed for generating viewable
LDR images from HDR content, there is no consensus on which algorithm to use and under which conditions.
This paper, via a series of subjective evaluations, demonstrates the dependency of perceived quality of the tone-mapped
LDR images on environmental parameters and image content. Based on the results of subjective tests, it
proposes to extend JPEG file format, as the most popular image format, in a backward compatible manner to also
deal with HDR pictures. To this end, the paper provides an architecture to achieve such backward compatibility
with JPEG and demonstrates efficiency of a simple implementation of this framework when compared to the
state-of-the-art HDR image compression.
Media streaming has found applications in many domains such as education, entertainment, communication
and video surveillance. Many of these applications require non-trivial manipulations of media streams, beyond
the usual capture/playback operations supported by typical multimedia software and tools. To support rapid
development of such applications, we have designed and implemented a scripting language called Plasma. Plasma
treats media streams as first-class objects, and caters to the characteristic differences between stored media files
and live media streams. In this paper, we illustrate the design and features of Plasma through several small
examples, and describe two example applications that we developed on top of Plasma. These two applications
demonstrate that using Plasma, complex applications that compose, mix, and filter multimedia streams can be
written with relatively little effort.