Smartphones are becoming popular nowadays not only because of their communication functionality but also, more importantly, their powerful sensing and computing capabilities. In this paper, we describe a novel and accurate image- and video-based remote target localization and tracking system on Android smartphones, leveraging their built-in sensors such as the camera, digital compass and GPS. Even though many other distance estimation or localization devices are available, our all-in-one, easy-to-use localization and tracking system on low-cost commodity smartphones is the first of its kind. Furthermore, our system takes effective advantage of the smartphone's user-friendly interface to achieve low complexity and high accuracy. Our experimental results show that the system works accurately and efficiently.
Large LiDAR (Light Detection And Ranging) data sets are used to create depth maps of objects and geographic areas. We explore, analyze and optimize the suitability of image compression methods for these large LiDAR data sets. Our research interprets LiDAR data as intensity-based "depth images", and uses k-means clustering, re-indexing and JPEG2000 to compress the data. The first step in our method applies the k-means clustering algorithm to an intensity image, creating a small index table, an index map and a residual image. Next, we use methods from previous research to re-index the index map to optimize compression under JPEG2000. Lastly, we compress both the re-indexed map and the residual image using JPEG2000, exploring both lossless and lossy compression. Experimental results show that in general we can losslessly compress the data to 23% of its original size, and even further when small amounts of loss are allowed.
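The first step of the pipeline above can be sketched as follows. This is an illustrative decomposition only: the 1-D k-means on pixel intensities, the choice of k, and the fixed iteration count are our own simplifications, not the paper's exact configuration, and the subsequent re-indexing and JPEG2000 coding stages are omitted.

```python
import numpy as np

def kmeans_decompose(depth_image, k=8, iters=20, seed=0):
    """Decompose an intensity 'depth image' into a small index table
    (the cluster centroids), an index map, and a residual image, as a
    first step before re-indexing and JPEG2000 coding.
    Sketch: parameters k, iters and the plain Lloyd iteration below are
    illustrative assumptions."""
    rng = np.random.default_rng(seed)
    pixels = depth_image.astype(np.float64).ravel()
    # Initialise centroids from randomly chosen pixels, then iterate.
    centroids = rng.choice(pixels, size=k, replace=False)
    for _ in range(iters):
        labels = np.argmin(np.abs(pixels[:, None] - centroids[None, :]), axis=1)
        for j in range(k):
            members = pixels[labels == j]
            if members.size:
                centroids[j] = members.mean()
    index_map = labels.reshape(depth_image.shape)    # small-alphabet image
    residual = depth_image - centroids[index_map]    # what k-means missed
    return centroids, index_map, residual
```

By construction, `centroids[index_map] + residual` reproduces the original image exactly, so lossless coding of the (re-indexed) map and the residual yields lossless coding of the data.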
The accuracy of motion estimation (ME) plays an important role in improving the coding efficiency of Wyner-Ziv video coding (WZVC). Most existing WZVC schemes perform ME at the decoder. The unavailability of the current frame on the decoder side usually impairs the accuracy of ME, which in turn degrades the coding efficiency of WZVC. To improve the accuracy of ME, some works in the literature assume the current frame can be progressively decoded, and the decoder iteratively refines the motion field based on each partially decoded image. In this paper, we present an analytical model to estimate the potential gain of multi-resolution motion refinement (MMR), assuming the current frame is progressively decoded in the frequency domain. The theoretical results show that at high rates, WZVC with MMR falls about 1.5 dB behind conventional inter-frame coding, but outperforms WZVC with motion extrapolation by 0.9 to 5 dB. Significant gains have also been observed in simulations using real video data.
In some applications, such as real-time video, watermark detection needs to be performed in real time. To make image watermarks robust against geometric transformations such as combinations of rotation, scaling, translation and/or cropping (RST), many prior works choose an exhaustive search or a template matching method to find the RST distortion parameters and then reverse the distortion to resynchronize the watermark. These methods typically impose a huge computational burden because the search space is multidimensional. Some other prior works choose to embed watermarks in an RST-invariant domain to meet the real-time requirement, but it can be difficult to construct such a domain. Zernike moments are useful tools in pattern recognition and image watermarking due to their orthogonality and rotation-invariance properties. In this paper, we propose a fast watermark resynchronization method based on Zernike moments, which requires a search over only the scaling factor to combat RST geometric distortion, thus significantly reducing the computational load. We apply the proposed method to circularly symmetric watermarking. By Plancherel's theorem and the rotation-invariance property of Zernike moments, rotation estimation requires performing a DFT on the Zernike-moment correlation values only once. Thus, for an RST attack, we can estimate both the rotation angle and the scaling factor by searching over the scaling factor for the overall maximum of this DFT magnitude. With the estimated rotation angle and scaling factor, the watermark can be resynchronized. In watermark detection, the normalized correlation between the watermark and the DFT magnitude of the test image is used. Our experimental results demonstrate the advantages of the proposed method: the watermarking scheme is robust to global RST distortion as well as JPEG compression. In particular, the watermark survives print-and-rescan and the randomization-bending local distortion in StirMark 3.1.
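The one-dimensional search described above might be sketched as follows. The input `zm_corr_by_scale` is a hypothetical structure mapping each candidate scale factor to a correlation sequence of Zernike moments along the angular repetition index; computing the Zernike moments themselves is omitted, and the peak-to-angle mapping is an illustrative assumption rather than the paper's exact formulation.

```python
import numpy as np

def estimate_rst(zm_corr_by_scale):
    """Search over the scaling factor only: for each candidate scale,
    a single DFT of the Zernike-moment correlation sequence exposes the
    rotation as the location of the peak magnitude.  The scale whose
    spectrum attains the overall maximum gives both estimates.
    Sketch under the assumptions stated in the lead-in."""
    best = (None, None, -np.inf)
    for scale, corr in zm_corr_by_scale.items():
        spectrum = np.abs(np.fft.fft(corr))   # one DFT per scale candidate
        k = int(np.argmax(spectrum))
        if spectrum[k] > best[2]:
            # Peak bin k maps to a rotation-angle estimate.
            angle = 2 * np.pi * k / len(corr)
            best = (scale, angle, spectrum[k])
    return best[0], best[1]   # estimated scale and rotation angle
```

The key saving is that rotation never enters the search loop: each candidate scale costs one FFT, so the overall cost is linear in the number of scale candidates.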
In this paper, we propose a novel source rate control algorithm for video streaming over the Internet. By incorporating a virtual network buffer management mechanism (VB), the QoS requirements of the application can be translated into source rate constraints, based on which source rate control is implemented. The maximum admissible bandwidth (or send-rate) constraint, imposed by the encoder and decoder buffer sizes of the application, is also derived. To ensure the send rate does not exceed this maximum bandwidth constraint, a rate regulator using the token bucket approach is adopted at the application layer to limit the output rate of the encoder buffer when necessary. Simulation results show that our proposed algorithm helps the application reduce overflow and underflow of the decoder buffer, and achieves better video quality and quality smoothness than traditional source rate control algorithms.
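The token bucket regulator mentioned above is a standard mechanism; a minimal sketch follows. The `rate` and `burst` parameters stand in for the constraints the paper derives from the encoder/decoder buffer sizes, and the byte-granularity admission test is our own simplification.

```python
class TokenBucket:
    """Token-bucket rate regulator at the application layer: limits the
    encoder-buffer output rate so the send rate stays within the maximum
    admissible bandwidth.  Illustrative sketch, not the paper's exact
    regulator."""

    def __init__(self, rate, burst):
        self.rate = rate        # token refill rate, bytes per second
        self.burst = burst      # bucket capacity, bytes
        self.tokens = burst     # bucket starts full
        self.last = 0.0         # timestamp of the last admission check

    def allow(self, nbytes, now):
        # Refill according to elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes   # send the packet now
            return True
        return False                # hold it in the encoder buffer
```

Packets denied by `allow` simply wait in the encoder buffer until enough tokens accumulate, which is what bounds the send rate without dropping data.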
This paper presents a robust watermarking scheme based on multi-band wavelets and the principal component analysis (PCA) technique. Incorporating PCA, the developed blind watermarking in the multi-band wavelet domain successfully resists common signal processing, such as JPEG compression with quality factors as low as 15, and geometric distortion such as cropping (by 50%). Unlike many other watermarking schemes, in which the watermark detection threshold is chosen empirically, the false-positive probability of the proposed scheme can be calculated, so the detection threshold can be chosen based only on the target false-positive rate. Compared with similar watermarks in the conventional two-band wavelet domain, the proposed scheme achieves greater perceptual transparency and more robustness. The parameterized multi-band wavelet also provides a more secure embedding domain, which makes attacks more difficult.
Content adaptation has been introduced to tailor the same media content to different derived user contexts, allowing end users with different access networks, client devices and/or user profiles to access the same information source. However, content adaptation also introduces security implications in the content distribution chain. In this paper, we conduct an in-depth investigation into the potential security issues involved in content adaptation in multimedia communication systems. We analyze the security requirements in the context of content adaptation. In particular, we address the issue of where to place the security functions and the implications for security and content adaptation functionality. The general security architectures for the protection of adapted content are categorized and analyzed, and the rationales and implications of some of the most recent multimedia security technologies are investigated under these architectures. We also discuss some open issues and suggest future directions. The paper provides the reader with an in-depth analysis, a comprehensive overview, and a better understanding of the security issues in a multimedia communication system where content adaptation is a necessity.
It is well known that tile boundary artefacts occur in lossy wavelet-based image coding. The base model of the JPEG2000 standard (i.e. JPEG2000 Part I), being a wavelet-based coding system, suffers from these artefacts. This paper analyses the tile boundary problems of JPEG2000 Part I and presents a novel method for reducing these tile boundary artefacts. This method has recently been adopted as part of the JPEG 2000 Verification Model 9.0 and as an addition to Part II of the JPEG2000 standard.
We describe a nonuniform quantization scheme for JPEG2000 that leverages the masking properties of the visual system, in which the visibility of distortions declines as image energy increases. Derivatives of contrast transducer functions convey the changes in visual threshold due to local image content (i.e. the mask). For any frequency region, these functions have approximately the same shape once the threshold and mask-contrast axes are normalized to the frequency's threshold. We have developed two methods that can work together to take advantage of masking. One uses a nonlinearity interposed between the visual weighting and uniform quantization stages at the encoder; in the decoder, the inverse nonlinearity is applied before the inverse transform. The resulting image-adaptive behavior is achieved with only a small overhead (the masking table), and without adding image-assessment computations. This approach, however, underestimates masking near zero crossings within a frequency band, so an additional technique pools coefficient energy in a small local neighborhood around each coefficient within a band. It does this in a causal manner to avoid overhead. The first effect of these techniques is to improve image quality as the image becomes more complex, allowing quality increases in applications where exploiting the visual system's frequency response provides little advantage. A key area of improvement is low-amplitude textures, in areas such as facial skin. The second effect concerns operational attributes: for a given bitrate, the image quality is more robust against variations in image complexity.
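The interposed-nonlinearity idea can be illustrated with a simple point nonlinearity: compressing coefficient magnitudes before a uniform quantizer makes the effective step size grow with amplitude, mimicking masking. The power-law form and the exponent `alpha` below are illustrative assumptions, not the transducer-derived nonlinearity of the actual scheme.

```python
import numpy as np

def quantize_with_masking(coeff, step, alpha=0.7):
    """Encoder side: nonlinearity interposed between visual weighting and
    uniform quantization.  A magnitude-compressive power law yields
    coarser effective quantization at high amplitudes.
    Sketch: alpha is an illustrative choice."""
    v = np.sign(coeff) * np.abs(coeff) ** alpha   # forward nonlinearity
    return np.round(v / step).astype(int)          # uniform quantizer

def dequantize_with_masking(q, step, alpha=0.7):
    """Decoder side: inverse uniform quantizer followed by the inverse
    nonlinearity, applied before the inverse transform."""
    v = q * step
    return np.sign(v) * np.abs(v) ** (1.0 / alpha)
```

Because the forward and inverse nonlinearities are pointwise, the only extra decoder state is the parameterization of the nonlinearity (the "masking table"), matching the small-overhead claim above.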
Digital watermarking has recently been proposed as a means for intellectual property rights protection of multimedia data. We present some ways to 'visualize' invisible watermarks, both statistically and perceptually, for proving ownership. We propose a system capable of embedding a good-resolution, meaningful binary watermark image and later extracting versions of that watermark image at varying resolutions. The system has the nice feature that the watermark detector (rather than the encoder) can adaptively choose the trade-off between the degree of robustness and the resolution of the extracted watermark image. It takes advantage of the high spatial correlation of the watermark image and the human visual system's superior ability to recognize a correlated pattern to enhance detection performance. While a statistical technique that can quantify the false-alarm detection probability should be considered a fundamental measure for a valid ownership claim, the ability to extract a meaningful watermark image greatly facilitates the process of convincing a jury of an ownership claim.
The huge success of the Internet permits effortless transmission and wide distribution of electronic data, and content providers are faced with the challenge of how to protect that data. This problem has generated a flurry of recent research activity in digital watermarking of electronic content for copyright protection. Unlike the traditional visible watermark found on paper, the challenge here is to introduce a digital watermark that does not alter the perceived quality of the electronic content while being extremely robust to attack. For instance, in the case of image data, editing the picture or illegal tampering should not destroy or alter the watermark; equally important, the watermark should not alter the perceived visual quality of the image. From a signal processing viewpoint, the two basic requirements for an effective watermarking scheme, robustness and transparency, conflict with each other. We propose a watermarking technique for digital images that utilizes visual models developed in the context of image compression. Specifically, we propose a watermarking scheme where visual models are used to determine image-dependent modulation masks for watermark insertion. In other words, for each image we can determine the maximum amount of watermark signal that each portion of the image can tolerate without affecting its visual quality. This allows us to provide the maximum-strength watermark, which in turn is extremely robust to common image processing and editing such as JPEG compression, rescaling, and cropping. We present watermarking results in a DCT framework as well as a wavelet framework. The DCT framework allows the direct insertion of watermarks into JPEG-compressed data, whereas the wavelet-based scheme provides a framework where we can take advantage of both local and global approaches.
Our scheme is shown to provide dramatic improvement over the current state-of-the-art both in terms of transparency and robustness.
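The mask-modulated insertion described above can be sketched in a few lines. Here `jnd` is a placeholder for the per-coefficient just-noticeable-difference thresholds that a visual model (DCT- or wavelet-based) would supply; both function names and the simple correlation detector are hypothetical illustrations, not the paper's exact scheme.

```python
import numpy as np

def embed_with_visual_mask(coeffs, watermark, jnd):
    """Image-dependent watermark insertion: each transform coefficient is
    perturbed by at most its JND, so the maximum-strength watermark stays
    visually transparent.  Sketch under the lead-in's assumptions."""
    return coeffs + jnd * np.sign(watermark)   # modulate within the mask

def detect(coeffs_test, coeffs_orig, watermark, jnd):
    """Correlation detector (sketch): normalize out the mask, then
    correlate the extracted perturbation with the candidate watermark."""
    extracted = (coeffs_test - coeffs_orig) / np.maximum(jnd, 1e-9)
    return float(np.dot(extracted.ravel(), np.sign(watermark).ravel())
                 / watermark.size)
```

Scaling the watermark by the mask is what resolves the robustness/transparency conflict: busy regions (large JND) carry a strong watermark, while smooth regions are perturbed only slightly.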
One way to efficiently combat channel errors is to employ unequal error protection (UEP) for information of different importance. The traditional approach to UEP through channel coding does not fully exploit the properties of the image signal: the highly correlated nature of the image signal makes it possible to detect and correct many channel errors directly on the decoded image. Based on this observation, we propose a new approach that provides UEP through channel coding in a way seemingly in contrast to traditional ones. Specifically, good source detection/correction schemes are incorporated to detect and correct, directly on the decoded image, the noticeable damage often caused by channel errors affecting the data deemed important by traditional UEP schemes. With the help of source detectors, the problem of how to prioritize the transmitted data for channel protection is revisited and reformulated. We show that by intelligently combining unequal channel protection and source detection, reconstructed images of both subjectively and objectively better quality can be obtained. A general framework for the new approach is presented, and three case studies are examined: one on transmission of vector-quantized images over noisy channels, one on a simplified fax coding system, and one suggested by an intelligent block-dropping approach.
The time-varying nature of wireless channels can cause severe error bursts or dropouts, so it is important to recover lost data in coded images for interactive video communication over wireless networks. A good spatial interpolation strategy is essential for replenishing missing blocks in still images and video frames. This paper proposes a novel spatial directional interpolation scheme that makes use of local geometric information extracted from the surrounding blocks. Specifically, the statistics of the local geometric structure are modeled as a bimodal distribution. The two nearest surrounding layers of pixels are converted into a binary pattern to reveal the local geometric structure, and a measure of directional consistency is employed to resolve ambiguity among possible connections of the transition points on the inner layer. The transition lines can be specified to one-pixel accuracy, unlike previous directional filtering schemes, which usually filter along only a single direction chosen from a finite candidate set. The new approach produces results superior to those of other approaches in terms of both peak signal-to-noise ratio (PSNR) and visual quality, with much reduced computation. Local structures such as edges, streaks and corners are well preserved in the reconstructed image.
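The binarization step above can be sketched as a two-class split of the surrounding pixels. The exhaustive minimum-within-class-variance threshold (Otsu-style) below is our own stand-in for the paper's bimodal-distribution model, and the transition points are simply the positions where the binary pattern flips.

```python
import numpy as np

def binarize_surrounding(ring_pixels):
    """Convert a layer of surrounding pixels into a binary pattern that
    reveals the local geometric structure, treating the neighbourhood as
    a bimodal distribution split by a two-class threshold.
    Sketch; the paper's exact thresholding rule may differ."""
    p = np.asarray(ring_pixels, dtype=float)
    # Exhaustively pick the cut that minimises within-class variance.
    candidates = np.unique(p)
    best_t, best_var = candidates[0], np.inf
    for t in candidates[:-1]:
        lo, hi = p[p <= t], p[p > t]
        w = lo.size * lo.var() + hi.size * hi.var()
        if w < best_var:
            best_var, best_t = w, t
    binary = (p > best_t).astype(int)
    # Transition points: positions where the binary pattern flips.
    transitions = np.flatnonzero(np.diff(binary))
    return binary, transitions
```

The transition points recovered here are the candidates that the directional-consistency measure would then connect across the missing block.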