A collaborative image processing method is proposed to enhance the shape and details of a scene through synthesizing a
set of multi-light images that capture a scene with fixed view-point but various lighting positions. A very challenging
problem is to remove the artifacts due to shadow edges from the synthesized image. To address this problem, a simple
Sobel-filter-based method is provided that exploits a property of multi-light images: their shadow edges usually
do not overlap. A detail layer that contains the details of all images is first constructed by using a gradient
domain method and a quadratic filter. Then a base layer is produced by using only one input image. The detail layer is
finally added to the base layer to produce the desired detail-enhanced image. Using this method, the details lost in the
shadows of the original input image can be recovered from the other images, and the sense of depth is well preserved
in the synthesized image. Interactive controls are also provided for users to adjust the appearance of the detail-enhanced image
according to their preferences.
This paper proposes an algorithm for generating a video signature based on an ordinal measure. Current methods which
use a measure of temporal ordinal rank are robust to many transformations but can only detect the entire query video, not
a segment of the query, while methods which use local features may be more robust to certain transformations but less
robust to excessive noise. The proposed algorithm incorporates region-based spatial information while maintaining a
strong robustness to noise, different resolutions, illumination shifts and video file formats. In our method, a frame is first
divided into blocks. For each pixel in a block, a slice (a binary image computed based on the comparison between the
greyscale intensity of each pixel in the frame and the reference pixel) is generated. The slices of all the pixels in a block
are then added component-wise to obtain a metaslice for the block. In order to compute the distance between any two
frames, the Euclidean distance between corresponding metaslices of the two frames is computed to obtain the
metadistance between two blocks. Summing the metadistances over all blocks and normalizing gives the final measure
of distance between the two frames. To improve the speed of the algorithm, keyframes are first downsized and pixel
intensity values are represented by the average of a small block. A table of frame differences between two sets of
keyframes from two video sequences is constructed and then converted to a similarity matrix using a threshold. The
longest chain of consecutive similar keyframes is found and this produces the best matching video sequence between the
two videos. This algorithm is capable of taking into account differences between videos at various scales and is useful
for finding duplicate or modified copies of a query video in a database. Preliminary experimental results are encouraging
and demonstrate the potential of the proposed algorithm.
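The slice, metaslice and chain-matching steps described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. Several details are assumptions made here: a slice entry is 1 where a frame pixel is brighter than the reference pixel, the summed metadistance is normalized by the number of blocks, and the "longest chain" is read as the longest diagonal run in the thresholded table. The function names are hypothetical.

```python
import math

def metaslices(frame, block):
    """Return one metaslice per block; frame is a list of rows of grey levels."""
    h, w = len(frame), len(frame[0])
    pixels = [v for row in frame for v in row]
    result = []
    for by in range(0, h, block):
        for bx in range(0, w, block):
            meta = [0] * (h * w)
            for y in range(by, min(by + block, h)):
                for x in range(bx, min(bx + block, w)):
                    ref = frame[y][x]
                    # Slice for this reference pixel (assumed rule: 1 if brighter),
                    # added component-wise into the block's metaslice.
                    for i, v in enumerate(pixels):
                        if v > ref:
                            meta[i] += 1
            result.append(meta)
    return result

def frame_distance(a, b, block):
    """Sum of Euclidean metadistances, normalized by block count (assumption)."""
    ma, mb = metaslices(a, block), metaslices(b, block)
    return sum(math.dist(x, y) for x, y in zip(ma, mb)) / len(ma)

def longest_chain(dist_table, threshold):
    """Longest run of consecutive similar keyframe pairs.

    Returns (length, start index in video A, start index in video B).
    """
    best = (0, 0, 0)
    prev = [0] * (len(dist_table[0]) + 1)
    for i, row in enumerate(dist_table):
        cur = [0] * (len(row) + 1)
        for j, d in enumerate(row):
            if d <= threshold:  # entry of the similarity matrix
                cur[j + 1] = prev[j] + 1
                n = cur[j + 1]
                if n > best[0]:
                    best = (n, i - n + 1, j - n + 1)
        prev = cur
    return best
```

Because only intensity comparisons enter the slices, adding a constant to every pixel leaves the metaslices unchanged, which is consistent with the robustness to illumination shifts claimed above.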
In this paper, we describe a fine granularity scalable (FGS) video coding scheme that refines both residue and
motion information in the quality layers. Significant gains can be achieved when each enhancement layer undergoes
the motion-compensated prediction process with its own motion vector field (MVF). However, a motion-refined
FGS scheme involves a motion estimation process for each enhancement layer of the scalable video. Given
the high computational cost of motion estimation in H.264, such encoders can be computationally expensive.
Our proposed scheme carries out a simplified motion refinement scheme for enhancement layers, exploiting
the correlation of motion information between successive layers through macroblock (MB) type refinement. By
restricting the MB type of each FGS-layer MB according to that of the base-layer MB, the time required for encoding
FGS layers can be reduced. By controlling the macroblock modes in both the base and the
enhancement layers, the encoding time can be substantially reduced with minimal impact on coding efficiency.
The encoder optimization scheme we describe is especially effective when encoding a video with a low bitrate
base layer and a large range of extractable bitrates.
In this work, we developed and implemented an image capturing and processing system that is equipped with the capability of
capturing images from an input video in real time. The input video can be a video from a PC, video camcorder or DVD
player. We developed two modes of operation in the system. In the first mode, an input image from the PC is processed
on the processing board (development platform with a digital signal processor) and is displayed on the PC. In the second
mode, the currently captured image from the video camcorder (or from the DVD player) is processed on the board but is displayed
on the LCD monitor. The major difference between our system and other existing conventional systems is that image-processing
functions are performed on the board instead of the PC (so that the functions can be used for further
developments on the board). The user can control the operations of the board through the Graphic User Interface (GUI)
provided on the PC. In order to have a smooth image data transfer between the PC and the board, we employed Real
Time Data Transfer (RTDX™) technology to create a link between them. For image processing, we developed
three main groups of functions: (1) Point Processing; (2) Filtering; and (3) 'Others'. Point Processing includes rotation,
negation and mirroring. The Filtering category provides median, adaptive, smoothing and sharpening filters in the time domain. In
the 'Others' category, auto-contrast adjustment, edge detection, segmentation and sepia coloring are provided; these functions
either add effects to the image or enhance it. We have developed and implemented our system using C/C#
programming languages on a TMS320DM642 (DM642) board from Texas Instruments (TI). The system was showcased
at the College of Engineering (CoE) exhibition 2006 at Nanyang Technological University (NTU), where more than 40
users tried it. This demonstrated that our system is adequate for real-time image capturing. The system can be
applied to areas such as medical imaging and video surveillance.
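To make the function groups concrete, here is a minimal sketch of two point-processing operations (negation and mirroring) and a 3x3 median filter. It is written in Python purely for illustration. The system described above runs on the DM642 board, and none of these function names come from it.

```python
from statistics import median

def negate(img, max_val=255):
    """Point processing: invert each grey level (max_val=255 assumes 8-bit data)."""
    return [[max_val - v for v in row] for row in img]

def mirror(img):
    """Point processing: flip the image left to right."""
    return [row[::-1] for row in img]

def median_filter3(img):
    """Filtering: 3x3 median; border pixels are copied unchanged (assumption)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out
```

The median filter suppresses isolated outliers such as a single bright pixel in a flat region, which is why it is commonly offered alongside smoothing filters.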
In this paper, hypothetical reference decoder (HRD) conformance is addressed for H.264 when packets are transmitted over a channel that has no packet loss but introduces jitter, that is, variation in transmission delay. The sending rate is decoupled
from the coding rate. Both the jitter and the total size of the coded bitstream are taken into consideration so that the buffer size and the initial buffer delay are minimized, especially
when the sending rate is greater than the coding rate. Sufficient conditions are derived for the conformance of a coded bitstream to an HRD at constant delay. These conditions are then used to design iterative algorithms that determine a minimal buffer size and a minimal initial buffer delay for the decoder. A novel
interpolation method is also presented so that the approach is suitable for a wide range of sending rates.
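The interplay of sending rate, jitter and total bitstream size can be illustrated with a simple leaky-bucket calculation. This sketch is not the iterative algorithm of the paper. It assumes a constant sending rate from time zero, one frame removed from the buffer per frame interval, and a bounded extra delay `max_jitter` per packet. The function name and all parameter names are hypothetical.

```python
def hrd_parameters(frame_bits, send_rate, frame_rate, max_jitter=0.0):
    """Return (initial_delay_seconds, buffer_bits) that avoid underflow/overflow.

    Assumed model: bits are sent at send_rate until the stream ends, and each
    bit may arrive up to max_jitter seconds late.
    """
    prefix = [0]
    for bits in frame_bits:
        prefix.append(prefix[-1] + bits)
    total = prefix[-1]

    # No underflow: frame i must have fully arrived, even in the
    # worst-case late arrival, by its removal time delay + i / frame_rate.
    delay = max_jitter + max(
        prefix[i + 1] / send_rate - i / frame_rate
        for i in range(len(frame_bits))
    )

    # No overflow: bound the fullness just before each removal, assuming
    # the earliest possible arrival and capping arrivals at the stream size.
    buffer_bits = 0.0
    for i in range(len(frame_bits)):
        t = delay + i / frame_rate
        arrived = min(send_rate * t, total)
        buffer_bits = max(buffer_bits, arrived - prefix[i])
    return delay, buffer_bits
```

For two 100-bit frames at one frame per second, raising the sending rate from 100 to 200 bit/s halves the initial delay. The buffer bound stays at 100 bits only because arrivals are capped at the total stream size. This is the effect the abstract attributes to accounting for total bitstream size when the sending rate exceeds the coding rate.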
In spatial error concealment (SEC), methods like bilinear interpolation (BI) and directional interpolation (DI) are commonly used to estimate the missing pixel values resulting from losses occurring in video streams. Despite being able to preserve spatial smoothness, BI produces a blurring effect and is unable to preserve structural information. DI produces spurious edges in regions with no strong edges, resulting in visible artefacts. In this paper, we propose an SEC algorithm that addresses these drawbacks by forming a weighted sum of candidate macroblocks produced by DI and BI, with weights derived adaptively from local information. We demonstrate that the proposed algorithm offers visual improvements over both DI-based algorithms and the BI-based SEC algorithm in the H.264/AVC reference software JM 12.0. Most importantly, because it integrates both BI and DI, this approach preserves edge information and spatial smoothness in the error-concealed macroblock.
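As a rough sketch of the weighted-sum idea, the code below builds a BI candidate for a lost square block from its four neighbouring boundary lines and blends it with a DI candidate supplied by the caller. How the weight is derived from local information is not specified above, so `w_di` is simply a hypothetical input in [0, 1]. The distance weighting in `bilinear_conceal` is a common textbook form, not necessarily the one used in JM.

```python
def bilinear_conceal(top, bottom, left, right):
    """BI candidate for an n x n lost block from the adjacent boundary pixels."""
    n = len(top)
    block = []
    for y in range(n):
        row = []
        for x in range(n):
            # Distances from (x, y) to the boundary line on each side.
            dt, db, dl, dr = y + 1, n - y, x + 1, n - x
            # The nearer boundary gets the larger weight.
            vertical = (db * top[x] + dt * bottom[x]) / (dt + db)
            horizontal = (dr * left[y] + dl * right[y]) / (dl + dr)
            row.append((vertical + horizontal) / 2)
        block.append(row)
    return block

def blend(bi_block, di_block, w_di):
    """Weighted sum of the two candidates; w_di = 0 gives pure BI."""
    return [[w_di * d + (1 - w_di) * b for b, d in zip(b_row, d_row)]
            for b_row, d_row in zip(bi_block, di_block)]
```

A larger `w_di` would be appropriate where a strong edge crosses the lost block, and a smaller one in smooth regions where DI tends to create spurious edges.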
Efficient delivery of streaming media content over the Internet has become an important area of research as such content rapidly gains popularity. Many studies have addressed this problem based on the client-proxy-server structure and have proposed mechanisms such as proxy caching and prefetching. While the existing techniques can improve the performance of accesses to reused media objects, they are not effective in reducing the startup delay for objects accessed for the first time. In this paper, we address this issue by proposing a more aggressive prefetching scheme to reduce the startup delay of first-time accesses. In our proposed scheme, proxy servers aggressively prefetch media objects before they are requested. We make use of servers' knowledge about access patterns to ensure the accuracy of prefetching, and we keep the prefetched data small by prefetching only the initial segments of media objects. Results of trace-driven simulations show that our proposed prefetching scheme can reduce the ratio of delayed requests by up to 38% with only a marginal increase in traffic.
In this paper, we propose a scheme to obtain a good tradeoff between motion information and
residual information for medium granular scalability (MGS). In this scheme, both motion information and residual
information are refined at enhancement layers when the scalable bit rate range is wide, whereas only residual
information is refined when the range is narrow. In other words, when the bit rate range is wide, there can
be more than one motion vector field (MVF): one is generated at the base layer and the others are generated at
the enhancement layers. When the range is narrow, only one MVF is necessary. The layers can either share one MVF or each have
their own, depending on the bit rate range across layers. Unlike in Coarse Granular Scalability (CGS), the correlation
between two adjacent MVFs in MGS is very strong. Hence MGS can be provided in the most important bit
rate range to achieve a better tradeoff between motion and residual information and a finer granularity in that
range. CGS can be applied in less important bit rate ranges to give a coarse granularity. Experimental results
show that the coding efficiency can be improved by up to 1 dB compared with the existing SNR scalability scheme
at high bit rates.
Delivering streaming media content over the Internet is a very challenging problem. Proxy servers have been introduced into streaming media delivery systems over the Internet, and many mechanisms have been proposed based on this structure, such as proxy caching and prefetching. While the existing techniques can improve the performance of accesses to reused media objects, they are not effective in reducing the startup delay for first-time accesses. In this paper, we propose a more aggressive server-assisted prefetching mechanism to reduce the startup delay of first-time accesses. In this mechanism, proxy servers prefetch media objects before they are requested. To ensure the accuracy of this advance prefetching, we make use of the server's knowledge about access patterns to locate the most popular media objects and provide this information to proxy servers as hints for prefetching. A proxy server makes its decision based on the hints and its users' profiles and prefetches suitable objects before they are accessed. Results of trace-driven simulations show that our proposed mechanism can reduce the ratio of delayed requests by up to 38% with only a marginal increase in traffic.
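The hint-driven flow can be sketched as below. This is only an illustration under simple assumptions: the server ranks objects by raw access counts, a user profile is a set of object identifiers of interest, and `fetch` is a caller-supplied function that retrieves the data to cache. All names are hypothetical and do not come from the paper.

```python
from collections import Counter

def server_hint(access_log, top_k):
    """Server side: the top_k most frequently accessed object ids."""
    return [obj for obj, _ in Counter(access_log).most_common(top_k)]

class Proxy:
    def __init__(self, user_interests):
        self.user_interests = set(user_interests)  # assumed profile format
        self.cache = {}

    def prefetch(self, hint, fetch):
        """Prefetch hinted objects that match the proxy's user profile."""
        for obj in hint:
            if obj in self.user_interests and obj not in self.cache:
                self.cache[obj] = fetch(obj)

    def startup_is_delayed(self, obj):
        """A first-time request is delayed only if nothing was prefetched."""
        return obj not in self.cache
```

A short usage example: with the access log `["a", "b", "a", "c", "a", "b"]` and `top_k=2`, the hint is `["a", "b"]`. A proxy whose users are interested in `{"b", "c"}` would then prefetch only `"b"`.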
Let the whole region be the whole bit rate range that customers are interested in, and let a sub-region be a specific bit rate range within it. The weighting factor of each sub-region is determined according to customers' interest. A new type of region of interest (ROI) is defined for SNR scalability by the following property: the gap between the coding efficiency of the SNR scalability scheme and that of state-of-the-art single-layer coding for a sub-region is a monotonically non-increasing function of its weighting factor. This type of ROI is used as a performance index to design a customer-oriented SNR scalability scheme. Our scheme can be used to achieve an optimal customer oriented scalable tradeoff (COST), so that the profit can be maximized.
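One way to state the ROI property formally is given below. The symbols are introduced here for illustration and are not from the original: sub-region $i$ has weighting factor $w_i$, and $\eta^{\mathrm{SL}}_i$ and $\eta^{\mathrm{SNR}}_i$ denote the coding efficiency of single-layer coding and of the SNR scalability scheme in that sub-region.

```latex
\[
  g_i = \eta^{\mathrm{SL}}_i - \eta^{\mathrm{SNR}}_i ,
  \qquad
  w_i \ge w_j \;\Longrightarrow\; g_i \le g_j
  \quad \text{for all sub-regions } i, j .
\]
```

In words, the more heavily a sub-region is weighted, the smaller the efficiency loss relative to single-layer coding that the scalable scheme is allowed to incur there.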
In this paper, we propose a new method for removing the coding artifacts that appear in JPEG 2000 coded images. The proposed method uses a fuzzy control model to control the weighting function for different image edges according to the gradient of pixels and membership functions. A regularized post-processing approach and a recursive line algorithm are described. Experimental results demonstrate that the proposed algorithm can significantly improve image quality in terms of both objective and subjective evaluation.