The ongoing growth in digital video content is increasing the need for efficient and effective systems for video indexing and retrieval. Progress is gated by developments along a number of fronts including development of tools for content annotation, invention of methods for automatic labeling and computer-assisted annotation, investigation of new techniques for indexing and retrieval, and development of standards for interoperation of video indexing and retrieval systems. MPEG-7 addresses the interoperability requirements by providing a metadata system for describing multimedia content using XML. In general, MPEG-7 allows interoperable indexing, searching, and retrieval of video, images, audio, and other forms of multimedia data. More specifically for video, MPEG-7 standardizes a number of content description tools that allow effective indexing and retrieval of video content. These tools include video segment description tools, textual annotation and transcription description tools, feature description tools, and semantics and model description tools. In this paper, we investigate the application of MPEG-7 for indexing and retrieval of video content and give examples of MPEG-7 video descriptions.
Video surveillance is gaining increasing popularity as a possible response to various threats such as terrorism, vandalism and crime. The need for automated analysis of the events monitored by video cameras and support for fast search and browsing of such recorded video data is evident. In this paper we present VISSTM, a prototype system that uses advanced video segmentation and MPEG-7 technology to analyse and index visual events in real time. Visual features such as shape, colour and texture are extracted and used to describe the images stored on the system. A search of large volumes of data can be performed very quickly. We show examples of the fast search made possible with VISSTM.
We present a novel low-complexity content-based browsing system for personal video recorders. It provides convenient access to any part of the content with an integrated browser-player that uses unique rapid summarization and indexing with compressed domain color and motion features, as well as audio features. Our summarization is similar in accuracy to other competing techniques and is computationally much simpler.
This paper presents a pair of MPEG-7-based applications: a description generator and a browser. The description generator is a semi-automatic tool that allows content providers to produce useful descriptions. Consuming these descriptions, the browser lets users search and browse content with functions such as multi-level highlight browsing and keyword- and color-based content search. Automatic shot segmentation and artificial text region detection tools are integrated into the description generator for efficient video indexing.
We briefly describe the process for creating an MP4 file and introduce the software tools used for the creation. Then, we describe the architecture of an MP4 player - Flavor Player - that implements the MPEG-4 Systems specification. The Flavor Player implements 2-D composition and depth ordering of objects, object animation, user interaction, MPEG-J, IPMP framework, and MP4 file support. Additionally, we describe a simplified version of the Flavor Player - Mild Flavor - that only implements the Object Descriptor Profile. Unlike the Flavor Player, Mild Flavor is also used to create and edit MP4 files in addition to playback.
In this paper, we present a real-time adaptive streaming video platform. The platform is fully compliant with the Internet Streaming Media Alliance Implementation Specification. It has been used for experiments on real-time video streaming and transcoding via unicast and multicast over heterogeneous networks. An example of streaming video over a lossy channel is given, and a simple and efficient scheme for packet loss recovery is presented.
Current products supporting video communication applications rely on existing computer architectures. RISC processors have been used successfully in numerous applications over several decades. DSP processors have become ubiquitous in signal processing and communication applications. Real-time applications such as speech processing in cellular telephony rely extensively on the computational power of these processors. Video processors designed to implement the computationally intensive codec operations have also been used to address the high demands of video communication applications (e.g., cable set-top boxes and DVDs). This paper presents an overview of a system-on-chip (SOC) architecture used for real-time video in wireless communication applications. The SOC specifications answer to the system requirements imposed by the application environment. A CAM-based video processor is used to accelerate data intensive video compression tasks such as motion estimations and filtering. Other components are dedicated to system level data processing and audio processing. A rich set of I/Os allows the SOC to communicate with other system components such as baseband and memory subsystems.
A layered video multicast framework for differentiated service (DS) networks is examined in this research. The framework provides various levels of QoS guarantee for heterogeneous users, with improved adaptation to network congestion. The proposed system consists of three key components: extended active queue management, hierarchical priority marking, and receiver-driven layered multicast with ECN (RLME). In particular, we introduce the RLME protocol, which effectively utilizes advanced features of active queues such as random early drop (RED) and explicit congestion notification (ECN) in DS networks. The RLME protocol quantitatively estimates the network congestion level via ECN and packet loss to improve its ability to adapt to network congestion, and utilizes the priority service of DS networks to minimize the effect of packet loss on reconstructed video quality. Simulation shows that the proposed system achieves a stable and controllable video QoS guarantee for heterogeneous video clients over DS networks.
A new inter-client synchronization framework for one-to-many (i.e., multicast) media streaming is proposed, employing server-client coordinated adaptive playout control. The proposed adaptive player controls the playback speed of audio and video by adopting time-scale modification of audio. Based on the overall synchronization status as well as the buffer occupancy level, the playout speed of each client is manipulated within a perceptually tolerable range. Additionally, the server implicitly helps increase the time available for retransmission while the clients perform an interactive error recovery mechanism with the assistance of playout control. By coordinating the playout speed of each client, inter-client synchronization with respect to the target presentation time is smoothly achieved. RTCP-compatible signalling between the server and the group clients is performed, with the exchange of control messages kept restricted. Network-simulator-based simulations show that the proposed framework can reduce playout discontinuity without degrading media quality, and thus mitigate client heterogeneity.
With the advent of high-speed access network technologies such as ADSL, increasing numbers of Internet users are participating in various interactive multimedia applications. Among these, the most popular are massively multi-player on-line games (MMPOGs). In MMPOGs, a large amount of event data is associated with various control objects. This event data has different characteristics from the data generally carried on the Internet: events occur very frequently, with short inter-arrival times, and they are quite small because they contain only control information. Most commercial MMPOGs use TCP or UDP as the transport protocol for the event data. However, TCP is a heavyweight protocol, owing to its complex congestion control algorithm and byte-oriented window scheme, which makes it difficult to support many concurrent users. UDP, on the other hand, is relatively lightweight, but it provides no functions for reliable transmission or session management. In this paper, we propose a new transport protocol, the Game Transport Protocol (GTP), which is designed for the transmission of the event data used by MMPOGs. GTP supports several functions designed to meet the various requirements of MMPOGs. First, GTP uses a packet-based window scheme, not a byte-based window scheme as in TCP. This scheme is quite simple and suits the small size of the event data. In addition, GTP performs session management and retransmission using GTP control blocks, and supports an adaptive retransmission scheme that controls the maximum number of retransmissions according to the real-time priority, in order to meet the time constraints of the event data. Although GTP is a specialized transport protocol optimized for MMPOGs, it could also be used as a transport protocol for other interactive multimedia applications.
Legacy buffer cache management schemes for multimedia servers are grounded in the assumption that the application accesses the multimedia file sequentially. However, the user access pattern may not be sequential in some circumstances. In a distance learning application, for example, the user may exploit the VCR-like functions (rewind and play) of the system and repeatedly access particular video segments in the middle of sequential playback. Such looping references can significantly degrade the performance of interval-based caching algorithms. An appropriate buffer cache management scheme is therefore required to deliver the desired performance even under workloads that exhibit looping reference behavior. We propose the Adaptive Buffer cache Management (ABM) scheme, which intelligently adapts to the file access characteristics. For each opened file, ABM applies either LRU replacement or interval-based caching depending on the Looping Reference Indicator, which indicates how strongly temporally localized the access pattern is. According to our experiments, ABM achieves a better buffer cache miss ratio than interval-based caching or LRU, especially when the workload exhibits both sequential and looping reference properties.
In this paper, we present the findings of a study on radio frequency (RF) signal switching and distribution techniques in a civil aviation aircraft. Using the Boeing 777 aircraft as a model, the methods and modes of RF signal switching and distribution were investigated. The aim is to evaluate system performance and, if possible, determine methods of improvement. The performance of the system was measured in terms of savings in system parameters such as the weight, size and length of the associated components. Instead of coaxial cables or twisted-pair wire for routing the RF signals, optical fiber cables were suggested as a method of improvement. During this study, the difficulty of achieving this objective became obvious due to the complexity of the problem. However, suggestions were made on possible methods of improvement.
Driven by the growth of the Internet and high bandwidth, more and more multimedia applications are emerging in the digital industry. However, the capacity and real-time performance of conventional storage architectures cannot meet the requirements of continuous media. The most important storage architectures used in the past are Direct Attached Storage (DAS) and RAID cabinets; more recently, Network Attached Storage (NAS) and Storage Area Networks (SAN) have become the alternative storage network topologies. Multimedia workloads, however, need more storage capacity and more simultaneous streams. In this paper, we introduce a novel concept, the 'Unified Storage Network' (USN), to build an efficient SAN over IP, to bridge the gap between NAS and SAN, and furthermore to resolve the storage scalability problem for multimedia applications.
The Data Over Cable Service Interface Specifications (DOCSIS) of the Multimedia Cable Network System (MCNS) organization are intended to support IP traffic over HFC (hybrid fiber/coax) networks with significantly higher data rates than analog modems and Integrated Services Digital Network (ISDN) links. The availability of high-speed access enables the delivery of high quality audio, video and interactive services. To support quality-of-service (QoS) for such applications, it is important for HFC networks to provide effective medium access and traffic scheduling mechanisms. In this work, a novel scheduling mechanism and a new bandwidth allocation scheme are proposed to support multimedia traffic over DOCSIS-compliant cable networks. The primary goal of our research is to improve the transmission of real-time variable bit rate (VBR) traffic in terms of throughput and delay under DOCSIS. To support integrated services, we also consider the transmission of constant bit rate (CBR) traffic and non-real-time traffic in the simulation. To demonstrate the performance, we compare the results of the proposed scheme with those of a simple multiple priority scheme. It is shown via simulation that the proposed method provides a significant improvement over existing QoS scheduling services in DOCSIS. Finally, a discrete-time Markov model is used to analyze the performance of voice traffic over DOCSIS-supported cable networks.
A dynamic mode-weighted error concealment method is proposed for video packets transmitted over noisy channels in this work. We first introduce two error concealment approaches. One is to reconstruct lost pixels by interpolating candidate pixels indicated by neighboring motion vectors. The other is to estimate the motion vector by a side matching algorithm. Four corrupted block reconstruction modes are described based on the two error concealment approaches. Then, the value of an erroneous pixel is replaced by a weighted sum of those reconstructed by two modes. The property of the weighted sum is analyzed. It is shown that the optimal weighting coefficients can be expressed as a formula in terms of the error variance and the correlation coefficients associated with the reconstruction modes. Furthermore, based on the decoder-based error tracking model, these weighting coefficients are dynamically updated to minimize the instant propagation and concealment error variance.
Extensive simulations are provided to demonstrate that the proposed method can lead to a satisfying performance in an error-prone environment.
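The weighted-sum step can be illustrated with the textbook minimum-variance combination of two estimates. The sketch below is an assumption made for illustration only: the abstract does not state the paper's actual formula, and the function names and the symbols var1, var2 (error variances of the two reconstruction modes) and rho (correlation coefficient of their errors) are hypothetical.

```python
import math

def optimal_weight(var1, var2, rho):
    """Weight w on estimate 1 that minimizes the variance of
    w*e1 + (1-w)*e2, where e1 and e2 are the two modes' errors."""
    cov = rho * math.sqrt(var1 * var2)       # covariance of the two errors
    denom = var1 + var2 - 2.0 * cov
    if denom == 0.0:                         # identical, fully correlated errors
        return 0.5
    return (var2 - cov) / denom

def combined_variance(w, var1, var2, rho):
    """Error variance of the weighted sum for a given weight w."""
    cov = rho * math.sqrt(var1 * var2)
    return w * w * var1 + (1.0 - w) ** 2 * var2 + 2.0 * w * (1.0 - w) * cov

def conceal_pixel(estimate1, estimate2, w):
    """Replace an erroneous pixel by the weighted sum of two reconstructions."""
    return w * estimate1 + (1.0 - w) * estimate2
```

With uncorrelated errors the more reliable mode simply receives the larger weight; a dynamic scheme would recompute the weight whenever the tracked error variances change.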
This paper proposes a dynamic bit rate control method for real-time video streaming over the Internet. It is based on a feedback mechanism using RTCP (RTP Control Protocol), which provides network congestion parameters such as inter-arrival jitter, fraction lost, and round-trip time through SR (Sender Report) and RR (Receiver Report) packets. The proposed method first detects network congestion, then categorizes the network state into four congestion levels by analyzing congestion parameters such as jitter and packet loss derived from the periodically arriving RR packets, and finally determines the coding bit rate according to the current congestion level. The proposed dynamic bit rate control mechanism has also been implemented in an MPEG-4 video transmission system. The experimental results show that the proposed method can successfully suppress packet loss and control the coding bit rate appropriately even for a congested network that does not guarantee QoS such as bandwidth resources and/or maximum delay.
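The mapping from RR feedback to a coding bit rate might be sketched as follows. All thresholds, the rate table, and the function names are hypothetical, since the abstract gives no numeric values; only the idea of four congestion levels driven by jitter and fraction lost comes from the text.

```python
# Hypothetical thresholds and rate table (not taken from the paper).
LOSS_THRESHOLDS = (0.01, 0.05, 0.15)          # fraction lost reported in RR
JITTER_THRESHOLDS_MS = (20.0, 50.0, 100.0)    # inter-arrival jitter in ms
BITRATE_KBPS = (512, 384, 256, 128)           # level 0 (clear) .. level 3 (heavy)

def level_from(value, thresholds):
    """Count how many thresholds the value has reached (0..3)."""
    return sum(value >= t for t in thresholds)

def congestion_level(fraction_lost, jitter_ms):
    """Take the worse of the loss-based and jitter-based levels."""
    return max(level_from(fraction_lost, LOSS_THRESHOLDS),
               level_from(jitter_ms, JITTER_THRESHOLDS_MS))

def target_bitrate_kbps(fraction_lost, jitter_ms):
    """Coding bit rate chosen for the current congestion level."""
    return BITRATE_KBPS[congestion_level(fraction_lost, jitter_ms)]
```

Taking the maximum of the two per-parameter levels is one conservative design choice; a real controller could also weight the parameters or add hysteresis to avoid oscillating between levels.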
Quality of Service (QoS) is an important issue in next-generation wireless networks providing multimedia services. In this paper, we address connection-level QoS provisioning in wireless multimedia networks, measured by the connection blocking and dropping probabilities. The connection-level QoS for multimedia services is guaranteed by achieving the minimum connection blocking probability subject to a constraint on the handoff dropping probability. A dynamic call admission control scheme is proposed to provide connection-level QoS in wireless multimedia networks. This scheme adopts a novel strategy called prompt-decreasing/timer-increasing (PDTI) to dynamically adjust the threshold for handoff channel reservation. It can maintain the handoff dropping probability at a target rate predefined in the system specification, while maximizing resource utilization and minimizing the new call blocking rate. The proposed solution is a measurement-based method that is practical for real-world deployment. Simulations are carried out to demonstrate the efficiency of the proposed PDTI scheme.
In traditional packet voice or the emerging 2.5G and 3G wireless data services, smooth and timely delivery of audio is an essential requirement in Quality of Service (QoS) provision. Our previous work showed that, by applying time-scale modification to audio signals, an adaptive play-out algorithm can be designed to minimize packet dropping at the receiver end. By stretching or compressing the audio frame duration, the proposed algorithm can adapt quickly to fluctuating delays, including delay spikes.
In this paper, we address packet audio QoS with emphasis on end-to-end delay, packet loss, and delay jitter. The characteristics of delay and loss are discussed. Adaptive playback enhances the audio quality by adapting to transmission delay jitter and delay spikes. Coupled with Forward Error Correction (FEC) schemes, the proposed delay and loss concealment algorithm achieves a lower overall application loss rate without sacrificing the average end-to-end delay. The optimal solution of such algorithms is discussed. We also investigate the effect of stretching-ratio transitions on perceived audio quality by measuring the objective Perceptual Evaluation of Speech Quality (PESQ) Mean Opinion Score (MOS).
Although variable bit rate (VBR) video has been characterized as self-similar by various researchers, models based on self-similarity considerations have not been previously studied. This paper investigates the application of discrete-time scale-invariant systems to modeling VBR video traces. The motivation for this study lies in the fact that the model discussed here evolves out of self-similarity considerations. The potential application of this system to classifying content-based scenes in VBR video is explored. This paper also demonstrates that, using heavy-tailed stable inputs, these models can match both the scene time-series correlations and the scene density functions.
One straightforward way to add a watermark to an image in the spatial domain is to add a pseudo-random noise pattern to the original image. The noise pattern can be generated from a seed. To detect the watermark in an image, the image is correlated with the noise pattern and the correlation is compared to a preset threshold. Important considerations for such correlation-based watermarking techniques are the probability of correct detection and the probability of false alarm. In this paper, we present a method that uses an "ill-posed" operator to pre-process the noise pattern. The watermark is obtained by pre-multiplying a noise pattern by the inverse of an "ill-posed" operator. An "ill-posed" operator has a large condition number, i.e., the ratio of the largest singular value to the smallest singular value. Because of the large condition number, the output of the inverse of an "ill-posed" operator changes greatly when the input changes slightly. In watermarking, this "ill-posedness" can be exploited to improve the performance of correlation-based watermarking, because the pseudo-random patterns generated by different seeds have very low correlation with each other and this feature is amplified by the inverse of the "ill-posed" operator. The "ill-posed" operator can be obtained from a wide range of fields such as heat diffusion, acoustic wave propagation, and the Laplace equation. Compared with the standard correlation-based watermark, the new watermark has a smaller payload and approximately the same probability of correct detection. In addition, the new watermark has a much lower probability of false alarm. In the paper, we describe the "ill-posed" operator in detail and use examples to demonstrate the performance of the watermark.
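The baseline correlation-based scheme described at the start of this abstract can be sketched as below. This covers only the standard seed-generated pattern and threshold detector, not the "ill-posed" operator pre-processing; the function names, the +/-1 pattern, the embedding strength, and the threshold are assumptions chosen for illustration, and the image is treated as a flat list of pixel values.

```python
import random

def noise_pattern(seed, n):
    """Pseudo-random +/-1 pattern that is reproducible from the seed."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(image, seed, strength):
    """Add the scaled noise pattern to the pixel values."""
    pattern = noise_pattern(seed, len(image))
    return [p + strength * w for p, w in zip(image, pattern)]

def correlation(image, seed):
    """Correlate the mean-removed image with the pattern for this seed."""
    pattern = noise_pattern(seed, len(image))
    mean = sum(image) / len(image)
    return sum((p - mean) * w for p, w in zip(image, pattern)) / len(image)

def detect(image, seed, threshold):
    """Declare the watermark present when the correlation exceeds the threshold."""
    return correlation(image, seed) > threshold
```

For a marked image the correlation is close to the embedding strength, while for an unmarked image or a wrong seed it stays near zero, so a threshold between the two trades correct detections against false alarms.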
In this paper, a new optodigital multiple information hiding and real-time extraction system is suggested. In the process of multiple information hiding, stego keys are generated by the combined use of a PRS (pseudo-random sequence) and an HM (Hadamard matrix), and they are then used to hide multiple data in an arbitrary cover image without crosstalk. To extract the multiple information hidden in the stego image in real time, a new optical NJTC (nonlinear joint transform correlator)-based extraction system is introduced. In this optical extraction system, the stego image and each of the stego keys are placed at the input plane of the correlator and jointly Fourier transformed. The power spectrum of the jointly Fourier transformed signal is detected in the spatial frequency domain and inversely Fourier transformed. The final correlation peaks between them can then be found in the correlation plane as an authentic signal. Good experimental results on multiple information hiding and optical extraction using the Arabic numerals "1", "2" and "3" suggest that a new optodigital multiple information hiding and real-time extraction system can be implemented.
It is accepted that a stream cryptosystem, which implements encryption by selecting a few parts of the block data and header information of the compressed video stream, can achieve good real-time performance and flexibility. A chaotic random number generator, such as the logistic map, is a comparatively promising substitute, but it is easily attacked by nonlinear dynamic forecasting and geometric information extraction. In this paper, we present a hyperchaotic cryptography scheme to encrypt compressed video, which integrates the logistic map with a linear congruential algorithm over the Z(2^32 - 1) field to strengthen the security of mono-chaotic cryptography while maintaining the real-time performance and flexibility of chaotic sequence cryptography. It also integrates asymmetric public-key cryptography and implements encryption and identity authentication of the control parameters in the initialization phase. In accordance with the importance of the data in the compressed video stream, encryption is performed in a layered scheme. In the proposed hyperchaotic cryptography, the value and updating frequency of the control parameters can be changed online to satisfy the requirements of network quality, processor capability, and security. Cryptanalysis shows that the proposed hyperchaotic cryptography is robust, and arithmetic evaluation and testing show good real-time performance and flexible implementation capability.
We investigate the task of wide format still image manipulation and compression within the framework of a wide format document data path. For such systems, the constraints are on performance and cost: data compression aims at reducing both storage cost and transfer times, which are critical for wide format printing systems. Nevertheless, several factors reduce overall system performance and usability. Using a compression algorithm unsuited to the document content, which can mix text, graphics and photographs, reduces its overall usefulness. Other factors limit usability, such as the incompatibility of the compressed data stream with basic image transformations, or pixel encoding mostly in raster order, which does not allow random image access. In this article, we survey the suitability of some of the existing standards and compressed file formats with respect to the constraints of large format document systems.
High quality video conferencing is an efficient tool for interactive scientific collaboration in the research community, especially for researchers separated by substantial distances. With the wide deployment of broadband wide area IP networks such as Internet2, there is an increasing demand for improved remote collaboration over these networks. To make high quality video-conferencing toolkits designed for local high-speed networks available over wide area IP networks, issues that are usually insignificant on local area networks must be considered. To this end, we have developed the Adaptation Layer Translator (ALX) to address these issues and solve the problems associated with real-time video and audio transmission over wide area IP networks. A conference control protocol is developed to coordinate the participants in an ALX-based conference. The ALX is also designed to adapt to heterogeneous network environments at different deployment sites.
The Program and System Information Protocol (PSIP), which provides system information about the contents of the MPEG-2 transport stream emitted by a broadcaster, is a set of tables that tells a receiver which process should get which packets. In this paper, we describe the elements of the PSIP tables and implement a PSIP parser and an electronic program guide (EPG) browser using DTV broadcasting. Electronics manufacturers use PSIP data to construct an interactive EPG that aids the navigation of information from all the channels in DTV receivers. The PSIP also provides various solutions for navigating an enhanced EPG, which satisfies the requirements of broadcasters and consumers.
Many real-time fields require a sustained high-speed data recording system. This paper proposes a high-speed, sustained data recording system based on complex RAID 3+0. The system consists of an Array Controller Module (ACM), String Controller Modules (SCMs) and a Main Controller Module (MCM). The ACM, implemented on an FPGA chip, splits the high-speed incoming data stream into several lower-speed streams and synchronously generates one parity code stream. It can also recover the original data stream when reading. The SCMs record the lower-speed streams from the ACM onto SCSI disk drives. In each SCM, dual-page buffer technology is adopted to implement the speed-matching function and satisfy the need for sustained recording. The MCM monitors the whole system and controls the ACM and SCMs to realize the data striping, reconstruction, and recovery functions. A method for determining the system scale is presented. Finally, two new schemes, Floating Parity Group (FPG) and full 2D-Parity Group (full 2D-PG), are proposed to improve system reliability and are compared with the Traditional Parity Group (TPG). With its high reliability, this recording system can be used conveniently in many areas of data recording, storage, playback and remote backup.
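The splitting and parity generation attributed to the ACM can be illustrated in software with byte-level striping and XOR parity, as in RAID 3. This is a minimal sketch under stated assumptions: the real ACM is FPGA hardware, and the function names, byte-wise interleaving, and zero padding here are illustrative choices, not details from the paper.

```python
from functools import reduce

def split_with_parity(data, n_streams):
    """Byte-interleave data into n_streams lower-rate streams plus one XOR parity stream."""
    pad = (-len(data)) % n_streams            # pad so every stripe is complete
    padded = data + bytes(pad)
    streams = [padded[i::n_streams] for i in range(n_streams)]
    parity = bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*streams))
    return streams, parity, len(data)

def rebuild_stream(streams, parity, lost_index):
    """Recover one lost stream by XOR-ing the parity with the surviving streams."""
    survivors = [s for i, s in enumerate(streams) if i != lost_index]
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(parity, *survivors))

def merge(streams, original_length):
    """Re-interleave the streams into the original byte sequence."""
    out = bytearray()
    for column in zip(*streams):
        out.extend(column)
    return bytes(out[:original_length])
```

Because XOR is its own inverse, any single failed stream can be rebuilt from the parity and the remaining streams, which is the property a single parity group relies on.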
A speech production model can be divided into three parts: the glottal source, articulation and radiation. Some previously proposed digital watermarking methods for speech are based on modifying quantized values or parameters of a coding scheme. In this paper, we propose a new watermarking method for speech that manipulates the articulation in the process of speech production. The proposed method models a quasi vocal tract equivalent to the speech production process. The watermarked vocal tract model is expressed by codebooks made of LSP (Line Spectrum Pair) parameters. The watermarking procedure for speech is as follows: 1) LSPs are extracted from the speech. 2) Some of the extracted LSPs are replaced by the codebook vectors. 3) Speech is synthesized using the replaced LSPs. In this process, watermarks are embedded indirectly into the speech. Evaluation tests on the speech quality and accuracy of the proposed method are discussed along with simulation results.
At present, on-demand services, including VOD, EOD and NOD, are among the most popular services on the Internet. The main problems for on-demand services, however, are excessive server load and insufficient network resources. Service providers therefore require a powerful, expensive server, and clients face long end-to-end delays and network congestion.
This paper presents a new distributive web-caching technique for smooth VOD services using distributed proxies in a Head-end-Network (HNET). The HNET consists of a Switching-Agent (SA) as a control node, several Head-end Nodes (HENs) as proxies, and clients connected to the HENs. Each HEN forms a LAN. Clients request VOD services from the server through a HEN and the SA. The SA is the heart of the HNET: all operations of the proposed distributive caching technique are performed under its control. The technique stores parts of a requested video on the corresponding HENs when clients connected to each HEN request the same video. Clients then access those HENs (proxies) alternately to acquire video streams. This eventually leads to equally loaded proxies (HENs).
We adopt a cache replacement strategy that combines LRU and LFU, removes streams obtained from other HENs before streams obtained from the server, and replaces the first block of a video last in order to reduce end-to-end delay.
A new error detection and correction method is proposed that relies on the correlations among the syntax parameters of an MPEG-2 bitstream. Since MPEG-2 has led to a variety of applications, the MPEG-2 video specification is quite flexible. The header parameters in a video-coding standard are very important, as the syntax elements, tables, and decoding processes all depend on the values of the header information. Therefore, transmission errors in the header information not only result in a serious visual degradation of the output video, but also cause an abnormal decoding process. A number of error detection and correction techniques have already been developed to recover the MPEG-2 visual quality. However, since most of these methods only consider macroblock data information including quantized DCT coefficients, they are unable to produce good results with videos that include errors in the header information. Accordingly, the current paper proposes a method for detecting and correcting bit errors in headers based on the correlations between header parameters, between consecutive pictures, and between macroblock data and header parameters. As a result, even if bit errors are generated in header parameters, which are crucial to successful decoding, experimental results showed that the proposed header error detection and correction method can improve the video quality without increasing the transmission bit rate.
The random early migration (REM) scheme was proposed in our previous work to balance the load of multiple media servers and decrease the average service delay. When a user request arrives, it is randomly directed to a media server that has the designated video content cached. When the load of this server exceeds a preset threshold, REM is executed by choosing one of its in-service requests and migrating it to another media server with a certain probability, where the exact probability is a function of the service load. We introduce a state matrix representation that stores the service load information of each media server and plays an important role in determining migration paths. All possible state matrices can be mapped to a vector space called the state matrix space (SMS). With SMS, we can analyze the performance of VoD systems, such as the failure rate and service delay, and the derived results are verified by numerical experiments. It is demonstrated that REM outperforms the normal migration scheme, with shorter service delay and lower failure rates.
Many error resilience techniques have been proposed to improve MPEG-4 coding performance. However, most of them can hardly detect and correct errors occurring in headers, motion vectors and macroblock mode data. MPEG-4/XML was introduced in our previous work to protect the important information of video against error corruption. However, the overhead of a straightforward XML description can be high. In this work, we develop an efficient XML compression algorithm to reduce the MPEG-4/XML file size. Furthermore, we present an XML-based MPEG-4 error resilience technique performed at the macroblock level. Experimental results are given to demonstrate the performance of the proposed MPEG-4/XML coding technique.
Matching Pursuit (MP) expands a signal over an overcomplete dictionary of normalized atoms in an iterative fashion. A careful selection of dictionary components is critical in the design of the MP algorithm for compact signal representation and manipulation. In this research, the use of MP as an alternative waveform-coding scheme for speech signals is investigated. The improvement of MP over conventional transform coding schemes is due to the use of overcomplete basis functions. Furthermore, the performance of the MP representation can be enhanced via a compact MP dictionary obtained from training. Inspired by the popular Vector Quantization (VQ) algorithm, a dictionary-training algorithm is proposed in this paper to find the optimal dictionary for MP in speech coding. The MP decomposition with a trained dictionary is shown to improve the compactness of speech representation over the traditional MP decomposition with a generic Gabor dictionary. A better SNR performance is achieved with a dictionary of limited size, which has good potential for future applications.
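The iterative expansion in the first sentence can be sketched as the classic greedy MP loop below. It shows only the generic algorithm over a given dictionary of unit-norm atoms; the dictionary-training step proposed in the paper is not reproduced, and the function names are assumptions.

```python
import math

def normalize(atom):
    """Scale an atom to unit Euclidean norm, as MP assumes."""
    norm = math.sqrt(sum(a * a for a in atom))
    return [a / norm for a in atom]

def matching_pursuit(signal, dictionary, n_iter):
    """Greedy MP: at each step pick the atom most correlated with the residual."""
    residual = list(signal)
    expansion = []                            # list of (atom_index, coefficient)
    for _ in range(n_iter):
        # Inner product of the current residual with every atom.
        products = [sum(r * a for r, a in zip(residual, atom)) for atom in dictionary]
        best = max(range(len(dictionary)), key=lambda k: abs(products[k]))
        coeff = products[best]
        expansion.append((best, coeff))
        # Subtract the selected component from the residual.
        residual = [r - coeff * a for r, a in zip(residual, dictionary[best])]
    return expansion, residual
```

With an overcomplete dictionary, a signal that aligns with one atom is captured in a single iteration, whereas an orthogonal basis of the same space might need several coefficients; this is the compactness argument the abstract makes.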
In the new MPEG-4 video coding standard, automatic video object segmentation plays a key role in supporting object-oriented coding and enabling content-based functionalities. Background subtraction is one of the basic automatic video object segmentation methods, but varying environmental illumination conditions often make it hard to apply. A robust background subtraction method is presented in this paper. A statistical background model is first set up. Hypothesis testing is then applied to the following frames to segment the video objects. The HSV color model is used, and its color components are efficiently analyzed and treated separately so that the proposed algorithm can adapt to different environmental illumination conditions. Shadows are detected, and a new background update algorithm is presented based on the observation that illumination changes are temporary and will not influence all the following frames. All of these contribute to the robustness of the method. The experimental results show that the proposed background subtraction method can automatically segment video objects robustly and accurately in various illumination environments.
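A generic form of the "statistical model plus hypothesis test" idea is sketched below for a single channel. It assumes a per-pixel Gaussian model with a k-sigma test, which is a common choice but not necessarily the paper's; the separate treatment of HSV components, shadow detection, and background updating are omitted, and the names and the parameters k and min_std are hypothetical.

```python
import math

def fit_background(frames):
    """Per-pixel mean and standard deviation over background-only training frames."""
    n = len(frames)
    n_pixels = len(frames[0])
    means = [sum(f[i] for f in frames) / n for i in range(n_pixels)]
    stds = [math.sqrt(sum((f[i] - means[i]) ** 2 for f in frames) / n)
            for i in range(n_pixels)]
    return means, stds

def segment(frame, means, stds, k=3.0, min_std=1.0):
    """Mark a pixel as foreground when it deviates from the background mean
    by more than k standard deviations (min_std guards against zero variance)."""
    return [abs(x - m) > k * max(s, min_std)
            for x, m, s in zip(frame, means, stds)]
```

Frames are flat lists of pixel values here; applying the same test per HSV component with different tolerances would be one way to make the decision less sensitive to brightness changes.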