The discrete wavelet transform (DWT) is a tool used extensively in image processing algorithms. It can decorrelate the information in the original image, which helps in compressing the data for storage, transmission, or other post-processing purposes. However, the finite extent of such images gives rise to edge artifacts in the reconstructed data. A commonly used technique to overcome this problem is a symmetric extension of the image, which preserves zeroth-order continuity in the data. This still produces undesirable edge artifacts in derivatives and subsampled versions of the image. In this paper we present an extension to Williams and Amaratunga's work on extrapolating the image data with a polynomial extrapolation technique before performing the forward or inverse DWT for biorthogonal wavelets. Comparative results of reconstructed data, with individual subband reconstruction as well as with the embedded zerotree coding (EZC) scheme, are also presented for both of the aforementioned techniques.
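The difference between the two boundary treatments named in this abstract can be illustrated on a 1-D signal. The following is only a minimal sketch under stated assumptions: the function names, the least-squares fit over the outermost samples, and the default polynomial order are all hypothetical choices made here for illustration, not the scheme of Williams and Amaratunga or of the paper.

```python
import numpy as np

def symmetric_extend(x, n):
    """Mirror n samples at each end (zeroth-order continuity only)."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x[:n][::-1], x, x[-n:][::-1]])

def polynomial_extend(x, n, order=2, fit_len=None):
    """Extrapolate n samples at each end from a polynomial fitted to the
    outermost fit_len samples (assumed default: order + 1, an exact fit)."""
    x = np.asarray(x, dtype=float)
    fit_len = fit_len or order + 1
    t = np.arange(fit_len)
    left_coef = np.polyfit(t, x[:fit_len], order)
    right_coef = np.polyfit(t, x[-fit_len:], order)
    # Evaluate the fitted polynomials just outside the fitted interval.
    left = np.polyval(left_coef, np.arange(-n, 0))
    right = np.polyval(right_coef, np.arange(fit_len, fit_len + n))
    return np.concatenate([left, x, right])

ramp = np.arange(8.0)
sym = symmetric_extend(ramp, 2)              # slope flips sign at the edges
poly = polynomial_extend(ramp, 2, order=1)   # slope is continued past the edges
```

On a linear ramp the mirrored extension introduces a kink at each boundary, while the polynomial extension continues the ramp, which is the kind of derivative discontinuity the abstract refers to.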
We developed a new method for character generation that makes very small characters (under 5 points) easy to read on a flat panel display. We removed the jagged edges of a character image by subdividing its dots into dots one third the size of the original ones. This method is based on two characteristics of the human eye. These characteristics allowed us to map the subdivided values onto display subpixels and adjust the digital counts of the subpixels. We conducted a subjective evaluation experiment to assess the method and found it to be effective.
A computer 3D model is central to both the modern production process and its preparation by means of concurrent design. Such a model often needs to be constructed from a physical part. This is the so-called Reverse Engineering process, which is performed through three-dimensional digitizing and CAD modeling. This paper proposes a low-cost 3D digitizer based on a video camera. The results of functional and structural modeling of this compact Reverse Engineering system are given. A method is proposed for the automated input of a number of 2D images of an irregular 3D physical object, their processing and transformation into a 3D computer model, and then into a new physical object by means of Rapid Prototyping. The results of theoretical and practical studies of the parameters of the proposed Reverse Engineering process are discussed.
Computer-based recognition of human facial expressions has been an active area of research since the 1970s. The ultimate goal is to realize an intelligent man-machine interface. Recently, constructive One-Hidden-Layer Feedforward Neural Networks (OHL-FNNs) have been found promising for facial expression recognition. The hidden units in an FNN usually share the same activation function, typically a sigmoidal function. However, it has not been proven that using the same activation function for all hidden units is the best choice in terms of network performance. In this paper, a new constructive polynomial OHL-FNN is proposed for pattern recognition. The well-known Hermite polynomials are used as activation functions for the hidden units. Each time a new hidden unit is added to the network, a Hermite polynomial whose order is increased by one is used as the activation function of that unit. The proposed technique is applied to the facial expression recognition problem, where the 2D DCT is performed over the entire face image and the resulting lower 2D DCT coefficients are fed to the constructive network for training. The advantages and limitations of the constructive polynomial OHL-FNN for pattern recognition are also discussed.
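The idea of giving each successive hidden unit a Hermite polynomial of the next order can be sketched as follows. This is a hedged illustration only: it assumes plain physicists' Hermite polynomials (via NumPy's `hermval`), a starting order of 1, and a linear output unit, none of which the abstract specifies; the names `hermite_activation`, `forward`, and `start_order` are hypothetical, and training of the constructive network is not shown.

```python
import numpy as np
from numpy.polynomial.hermite import hermval

def hermite_activation(z, order):
    """Evaluate the physicists' Hermite polynomial H_order at z."""
    coef = np.zeros(order + 1)
    coef[order] = 1.0          # select the single basis polynomial H_order
    return hermval(z, coef)

def forward(x, weights, biases, out_weights, start_order=1):
    """Output of a one-hidden-layer network in which the k-th hidden unit
    uses H_(start_order + k) as its activation (assumed convention)."""
    x = np.asarray(x, dtype=float)
    total = 0.0
    for k, (w, b, v) in enumerate(zip(weights, biases, out_weights)):
        pre_activation = x @ np.asarray(w, dtype=float) + b
        total = total + v * hermite_activation(pre_activation, start_order + k)
    return total

# Two hidden units: the first uses H_1(z) = 2z, the second H_2(z) = 4z^2 - 2.
y = forward([1.0, 2.0],
            weights=[[1.0, 0.0], [0.0, 1.0]],
            biases=[0.0, 0.0],
            out_weights=[1.0, 0.5])
```

Adding a hidden unit in this sketch amounts to appending one more entry to `weights`, `biases`, and `out_weights`; the new unit automatically receives the next polynomial order.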
Digital terrestrial broadcasting, like communication satellite broadcasting, can offer many programs to the TV viewer, and it will spread in Japan. Image retrieval is necessary for viewers to navigate so many programs. We therefore propose a method that retrieves a particular scene from video. The retrieval method uses the motion vectors contained in the MPEG-2 stream. Specifically, the movement in the image is defined as a characteristic derived from the motion vector information. The characteristics of the retrieval target and of the query scene are then compared, and in this way the target scene is retrieved. We examined the efficacy of this retrieval method using television footage.
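A comparison of motion characteristics between a query scene and a longer video might look like the sketch below. The abstract does not say which characteristic is derived from the motion vectors or how the comparison is made, so both the per-frame mean motion vector and the sliding-window sum of squared differences are assumptions chosen purely for illustration; extracting the motion vectors from an MPEG-2 stream is not shown, and the function names are hypothetical.

```python
import numpy as np

def motion_feature(mv_field):
    """Mean (dx, dy) over one frame's motion field of shape (rows, cols, 2)."""
    return np.asarray(mv_field, dtype=float).reshape(-1, 2).mean(axis=0)

def find_scene(video_features, query_features):
    """Return the start frame of the window whose per-frame features are
    closest to the query (smallest sum of squared differences)."""
    v = np.asarray(video_features, dtype=float)
    q = np.asarray(query_features, dtype=float)
    n_windows = len(v) - len(q) + 1
    costs = [np.sum((v[i:i + len(q)] - q) ** 2) for i in range(n_windows)]
    return int(np.argmin(costs))

# Toy per-frame features for a five-frame video and a two-frame query.
video = [[0, 0], [1, 0], [2, 1], [0, 0], [5, 5]]
query = [[2, 1], [0, 0]]
start = find_scene(video, query)
```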
Our previous work on the corruption model is extended in this research to cover a more general error propagation scenario, and is then used to develop a joint AIR/UEP (adaptive intra-refresh/unequal error protection) scheme. Basically, we relate AIR to the importance of packets via the corruption model at the encoder. Once the intra-refresh ratio is determined, which is currently assumed to be known either from a priori knowledge or from network feedback, the expected distortion for each macroblock is calculated based on the corruption model and AIR is performed accordingly. The corruption model exploits the initial error strength, which depends on the normative error concealment scheme, and the motion vectors of previously coded frames. The expected loss impact of each macroblock is calculated concurrently by reusing the same corruption model in the reverse direction. Furthermore, if there is feedback about packet loss (e.g. ACK/NACK), the stored corruption models for lost macroblocks are updated by error tracking so that the updated models affect subsequent AIR/UEP accordingly. Finally, packetization of protected data is considered for video transmission over IP networks.
This paper describes a new panoramic, 360-degree video system and its use in a real application for virtual tourism. Developing this system required the design of new hardware for multi-camera recording and of software for video processing, both to assemble the panorama frames and to play back the resulting high-resolution video footage on a regular PC. The system makes use of new VR display hardware, such as WindowVR, to make the view dependent on the viewer's spatial orientation and so enhance immersiveness. There are very few examples of similar technologies, and the existing ones are extremely expensive and/or impossible to implement on personal computers with acceptable quality. The idea of the system starts from the concept of the panorama picture, developed in technologies such as QuickTimeVR. This idea is extended to the concept of a panorama frame, which leads to panorama video. However, many problems must be solved to implement this simple scheme. Data acquisition involves simultaneously recording footage in every direction, followed by later processing to convert every set of frames into a single high-resolution panorama frame. Since there is no common hardware capable of 4096x512 video playback at 25 fps, the video must be split into smaller pieces, which the system must manage so as to fetch the right frames of the right pieces as the user's movement demands. As the system must be immersive, the physical interface for watching the 360-degree video is a WindowVR, that is, a flat screen with an orientation tracker that the user holds in his hands, moving it as if it were a virtual window through which the city and its activity are shown.
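Selecting which pieces of the panorama to decode for a given viewing direction can be sketched as below. The abstract gives only the 4096x512 panorama size; the division into vertical strips, the 512-pixel strip width, the field of view, and the function name `strips_for_view` are assumptions introduced here for illustration.

```python
import math

def strips_for_view(yaw_deg, fov_deg, pano_width=4096, strip_width=512):
    """Return indices of the vertical strips that cover the horizontal view
    window centred on yaw_deg, wrapping around the 360-degree seam."""
    n_strips = pano_width // strip_width
    px_per_deg = pano_width / 360.0
    left = (yaw_deg - fov_deg / 2.0) * px_per_deg    # left edge in pixels
    right = (yaw_deg + fov_deg / 2.0) * px_per_deg   # right edge in pixels
    first = math.floor(left / strip_width)
    last = math.ceil(right / strip_width) - 1
    if last - first + 1 >= n_strips:
        return list(range(n_strips))                 # window covers everything
    return [i % n_strips for i in range(first, last + 1)]

front = strips_for_view(0.0, 60.0)     # straddles the seam of the panorama
back = strips_for_view(180.0, 60.0)    # opposite viewing direction
```

Only the returned strips need to be decoded for the current frame, so the playback load follows the size of the view window rather than the full panorama width.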
In this work we propose a new approach to modeling video data. To interpret the video semantics, we propose to model the video on the basis of the underlying dynamics it contains. Thus, the video is seen as a measurement of the properties of the objects embedded in it and of their behaviors over time. The objects' behaviors are described by states and state transitions using a statechart diagram. This diagram is then used to partition the video into meaningful segments. For efficient retrieval of information, we propose to use indexes based on the states of objects. The proposed model thus helps to store information about similar types of video data in a single database schema and supports content-based querying from a repository of video data.
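Partitioning a video at state transitions and indexing the segments by state can be sketched as follows. This is a simplified illustration that assumes a single tracked object with one state label per frame; the state names, the function names, and the in-memory dictionary standing in for a database index are all hypothetical, and deriving the states from the video is not shown.

```python
from collections import defaultdict
from itertools import groupby

def segment_by_state(frame_states):
    """Split a per-frame state sequence into (state, start, end) segments,
    with inclusive frame indices; a new segment begins at each transition."""
    segments = []
    position = 0
    for state, run in groupby(frame_states):
        length = len(list(run))
        segments.append((state, position, position + length - 1))
        position += length
    return segments

def build_state_index(segments):
    """Map each state to the list of (start, end) segments in which it holds."""
    index = defaultdict(list)
    for state, start, end in segments:
        index[state].append((start, end))
    return dict(index)

states = ["idle", "idle", "run", "run", "run", "idle"]
segments = segment_by_state(states)
index = build_state_index(segments)
```

A content-based query such as "all segments in which the object is running" then reduces to a lookup of `index["run"]`.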
A video streaming solution in which a fine granular scalable (FGS) MPEG-4 stream is delivered over prioritized networks is investigated in this work. An optimal truncation strategy for constant-quality rate adaptation is first presented by embedding the rate-distortion (R-D) information based on a piecewise linear R-D model. Then, the video stream is prioritized for differentiated dropping and forwarding, where rate adaptation is performed dynamically to meet the time-varying available bandwidth. It is shown that, although the prioritized stream benefits from the prioritized network, its gain depends heavily on how well the video source and network priorities match each other. All key components, including FGS encoding, rate adaptation and packetization, error-resilient decoding, and differentiated forwarding, are seamlessly integrated into one system. By focusing on the end-to-end quality, we set both source and network parameters properly to achieve superior performance of FGS video streaming.
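Constant-quality truncation under a piecewise linear R-D model can be illustrated with a generic sketch: find the common distortion level at which the per-frame rates just fit the bit budget. The abstract does not describe the actual strategy, so the bisection search, the representation of each frame as a list of (rate, distortion) breakpoints, and the function names are assumptions made for this example only.

```python
import numpy as np

def rate_for_distortion(points, d):
    """Rate needed to reach distortion d on a piecewise linear R-D curve.
    points: (rate, distortion) pairs, rate ascending, distortion descending."""
    rates, dists = zip(*points)
    # np.interp needs ascending x values, so both sequences are reversed.
    return float(np.interp(d, dists[::-1], rates[::-1]))

def constant_quality_truncation(frames, budget, iters=50):
    """Bisect on a common distortion level so the summed rate meets budget.
    If the budget is below the minimum total rate, the result is infeasible
    and simply returns the highest-distortion operating point."""
    lo = min(p[-1][1] for p in frames)   # best distortion any frame reaches
    hi = max(p[0][1] for p in frames)    # worst distortion of any frame
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        total = sum(rate_for_distortion(p, mid) for p in frames)
        if total > budget:
            lo = mid                     # too many bits: accept more distortion
        else:
            hi = mid                     # fits: try for lower distortion
    return hi, [rate_for_distortion(p, hi) for p in frames]

frame_a = [(0.0, 10.0), (100.0, 0.0)]
frame_b = [(0.0, 20.0), (100.0, 0.0)]
level, rates = constant_quality_truncation([frame_a, frame_b], budget=100.0)
```

In this toy case the harder frame (`frame_b`) receives roughly two thirds of the budget so that both frames end at the same distortion level.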
This paper reports a study of a new region-wide search and pursuit system for missing objects such as stolen cars and wandering people. By using image matching based on object properties such as color and shape, an intelligent camera can search for the object. The camera then transmits the properties to the next camera so that the object is pursued successively. The experimental results show that the system can identify 2 cars as the search objects among 40 cars under changing environmental conditions. These data indicate that the proposed system accomplishes a fundamental step. Finally, topics for further research are identified, such as accurate shape extraction, camera structure for high-speed processing, and multimedia attributes such as sound.