With all the hype created around multimedia in recent years, consumers expect to be able to access multimedia content in real time, anywhere and anytime. One problem with the real-time requirement is that transport networks, such as the Internet, are still prone to errors. Because of real-time constraints, retransmission of lost data is, more often than not, not an option. The study of error resilience and error concealment techniques is therefore of the utmost importance, since such techniques can seriously limit the impact of a transmission error. This paper evaluates flexible macroblock ordering (FMO), one of the new error resilience tools in H.264/AVC, by analyzing its costs and gains in an error-prone environment. More specifically, we study scattered slices, FMO type 1. Our analysis shows that FMO type 1 is a good tool for introducing error robustness into an H.264/AVC bitstream as long as the QP is higher than 30. When the QP of the bitstream is below 30, the cost of FMO type 1 becomes a serious burden.
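As an illustration, FMO type 1 assigns spatially adjacent macroblocks to different slice groups, so a lost slice leaves each damaged macroblock surrounded by correctly received neighbors available for concealment. A minimal sketch of a checkerboard-style dispersed mapping (a simplification in the spirit of the standard's dispersed slice-group pattern; the function name is ours):

```python
def dispersed_slice_group(mb_idx, pic_width_mbs, num_groups=2):
    """Map a macroblock index to a slice group in a dispersed
    (checkerboard-style) pattern, in the spirit of FMO type 1."""
    x = mb_idx % pic_width_mbs          # macroblock column
    y = mb_idx // pic_width_mbs         # macroblock row
    return (x + (y * num_groups) // 2) % num_groups

# With two groups, horizontal and vertical neighbors land in
# different slice groups, so losing one group still leaves every
# damaged macroblock with intact neighbors for concealment.
row0 = [dispersed_slice_group(i, 4) for i in range(4)]
row1 = [dispersed_slice_group(i, 4) for i in range(4, 8)]
print(row0, row1)   # alternating groups across both rows
```

The robustness/cost trade-off the abstract measures comes precisely from this scattering: concealment improves, but prediction across slice-group boundaries is restricted, which costs bit rate at low QP.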
In order to cope better with packet loss, H.264/AVC, besides offering superior coding efficiency, also comes with a number of error resilience tools. The goal of these tools is to enable the decoding of a bitstream containing encoded video even when parts of it are missing, while keeping the visual quality of the decoded video as high as possible. In this paper, we discuss and evaluate one of these tools, the data partitioning tool. Experimental results show that using data partitioning can significantly improve the quality of a video sequence when packet loss occurs. However, this is only possible if the channel used for transmitting the video allows selective protection of the different data partitions. In the most extreme case, an increase in PSNR of up to 9.77 dB can be achieved. This paper also shows that the overhead caused by using data partitioning is acceptable: in terms of bit rate, it amounts to approximately 13 bytes per slice, in general less than 1% of the total bit rate. On top of that, using constrained intra prediction, which is required to fully exploit data partitioning, causes a decrease in quality of about 0.5 dB for high-quality video and between 1 and 2 dB for low-quality video.
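The idea behind data partitioning can be pictured as routing a slice's syntax elements into the three partition types discussed above: partition A (headers and motion vectors), B (intra coefficients), and C (inter coefficients), so the channel can protect A most strongly. A hypothetical illustration (the element kinds and names are ours, not actual H.264 syntax):

```python
def partition_slice(elements):
    """Route (kind, payload) syntax elements into partitions A/B/C.
    A carries headers and motion data and deserves the strongest
    channel protection; B and C carry residual coefficients."""
    parts = {"A": [], "B": [], "C": []}
    for kind, payload in elements:
        if kind in ("header", "mv"):
            parts["A"].append(payload)
        elif kind == "intra_coeff":
            parts["B"].append(payload)
        else:                           # inter residuals and the rest
            parts["C"].append(payload)
    return parts

slice_elems = [("header", "h0"), ("mv", "m0"),
               ("intra_coeff", "i0"), ("inter_coeff", "p0")]
print(partition_slice(slice_elems))
```

Losing C alone still leaves motion-compensated concealment possible from A, which is why selective protection of the partitions yields the PSNR gains reported.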
In this paper, we propose a macroblock-level adaptive dynamic resolution conversion (DRC) technique that lets the encoder decide, on a block-by-block basis, to reduce the resolution of the input image for better compression efficiency. By reducing the spatial resolution of a block, the proposed scheme provides additional compression. Because a suitable resolution for each block is selected adaptively in a rate-distortion optimized way, more flexible coding is supported, adapting to the characteristics of the image. Simulations based on the state-of-the-art H.264 standard codec demonstrate that the proposed scheme outperforms H.264 in terms of rate-distortion.
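The per-block decision can be pictured as a standard Lagrangian rate-distortion choice among candidate resolutions. A toy sketch (the candidate labels and the lambda value are illustrative assumptions, not the paper's actual parameters):

```python
def rd_cost(distortion, rate, lam):
    # Lagrangian cost J = D + lambda * R
    return distortion + lam * rate

def choose_resolution(candidates, lam=0.5):
    """Pick the (label, distortion, rate) candidate with minimal RD
    cost, e.g. coding a macroblock at full vs. reduced resolution."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))[0]

# Reduced resolution wins when its rate saving outweighs the added
# distortion at the current operating point.
print(choose_resolution([("full", 10.0, 100), ("half", 20.0, 20)]))
```

Sweeping lambda traces out the rate-distortion curve on which the proposed scheme is compared against plain H.264.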
The flickering effect is a serious problem in intra-only coding; it is caused by frame-to-frame variations in the accuracy loss that the quantization process introduces into the transform coefficients. Nevertheless, it has not been studied sufficiently. In this paper, we analyze why the flickering effect happens and illustrate our observations using the intra-only coding scheme of the H.264/AVC standard. Based on our analysis, we propose a flickering-effect reduction scheme, a pre-processing method using the Kalman filtering algorithm. Simulation results show that the proposed scheme increases subjective visual quality by removing the flickering effect.
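As a concrete picture of such pre-processing, a scalar Kalman filter can smooth each pixel's (or coefficient's) trajectory across frames before encoding, suppressing the frame-to-frame fluctuations that quantization would otherwise turn into visible flicker. A minimal per-sample sketch (the noise variances q and r are illustrative, not the paper's tuning):

```python
def kalman_smooth(samples, q=1e-3, r=1.0):
    """Scalar Kalman filter over a temporal sequence of values.
    q: process noise variance, r: measurement noise variance."""
    x, p = samples[0], 1.0          # state estimate and its variance
    out = [x]
    for z in samples[1:]:
        p += q                      # predict: uncertainty grows
        k = p / (p + r)             # Kalman gain
        x += k * (z - x)            # update toward the new measurement
        p *= (1.0 - k)
        out.append(x)
    return out

# A flickering pixel value is pulled toward a stable trajectory.
print(kalman_smooth([100, 104, 96, 105, 95]))
```

A small q relative to r encodes the assumption that the underlying scene changes slowly, so most of the frame-to-frame variation is treated as noise and filtered out.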
In this paper, we investigate the use of existing audio codecs for a high-quality color ring-back-tone service. First, we examine the limitations of the enhanced variable rate codec (EVRC) with respect to music quality, since EVRC is the standard speech coder employed in code division multiple access (CDMA) systems. To determine which existing audio codec is suitable for delivering music over CDMA or wideband CDMA (W-CDMA), several audio codecs, including two different versions of MPEG AAC and the Enhanced AAC+ codec, are reviewed. Next, the music quality of the audio codecs is compared with that of EVRC, where the bit rates of the audio codecs are set to around 10 kbit/s, because a color ring-back-tone service using one of the audio codecs should be realized by replacing EVRC with it. The quality comparison is performed by an informal listening test as well as an objective quality test. The experiments show that the audio codecs provide better music quality than EVRC and that, among them, the Enhanced AAC+ codec operated at a bit rate of 10 kbit/s with a sampling rate of 32 kHz can be considered a new candidate for the high-quality color ring-back-tone service.
This paper assesses the media synchronization quality of preventive control schemes employed at sources for stored video and voice transmitted over a network. Preventive control techniques try to avoid asynchrony (i.e., loss of synchronization). We deal with two preventive control techniques employed at sources: advancement of the transmission timing of media units (MUs) based on network delay estimation, and temporal resolution control of video. By experiment, we compare the performance of preventive control schemes that employ one or both of these techniques. Experimental results show that a scheme which combines the advancement of transmission timing with temporal resolution control is the most effective in terms of media synchronization quality.
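The first technique, transmission-timing advancement, can be sketched as estimating the network delay and sending each MU early enough to meet its playout deadline. Here we use a simple exponentially weighted moving average as the delay estimator (our choice for illustration, not necessarily the paper's estimator):

```python
def ewma_delay(delay_samples, alpha=0.125):
    """Smoothed one-way delay estimate (ms) from observed samples."""
    est = float(delay_samples[0])
    for d in delay_samples[1:]:
        est = (1.0 - alpha) * est + alpha * d
    return est

def send_time(mu_timestamp, playout_offset, est_delay):
    """Advance transmission so the MU, delayed by roughly est_delay,
    still arrives before its playout deadline."""
    return mu_timestamp + playout_offset - est_delay

est = ewma_delay([80, 120, 100, 110])
print(round(est, 1), send_time(0, 200, est))
```

When the estimate tracks the actual delay well, MUs arrive just in time and intra- and inter-stream synchronization is preserved without receiver-side skipping.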
This paper deals with a remote control system in which one haptic interface device is controlled by another, remote haptic interface device. Applications of the system include a remote drawing instruction system, a remote calligraphy system, and a remote medical operation system. This paper examines the influence of network latency on the output quality of haptic media by subjective assessment in the remote drawing instruction system. As a result, we show that the instructor gives smaller Mean Opinion Score (MOS) values than the learner, and that the MOS value can be estimated with high accuracy from the sum of the network latency from the instructor's terminal to the learner's terminal and that in the opposite direction.
This paper deals with a system which conveys the haptic sensation experienced by a user to a remote user. In the system, one user controls a haptic interface device with another, remote haptic interface device while watching video. The haptic media and video of a real object which the user is touching are transmitted to the other user. By subjective assessment, we investigate the allowable and imperceptible ranges of synchronization error between the haptic media and video. We employ four real objects and, for each object, ask each subject whether the synchronization error is perceived. The assessment results show that the synchronization error is perceived more easily when the haptic media are ahead of the video than when they are behind it.
In this paper, we present two transport-related experimental studies of networked haptic CVEs (collaborative virtual environments). The first set of experiments evaluates the performance changes, in terms of QoE (quality of experience), of haptic-based CVEs under different network settings. The evaluation results are then used to define the minimum networking requirements for CVEs with a force-feedback haptic interface. The second set of experiments verifies whether existing haptics-specialized transport protocols can satisfy the networking QoE requirements of networked haptic CVEs. The results suggest design guidelines for an effective transport protocol for these highly interactive haptic CVEs (with processing cycles of up to 1 kHz and extremely low delay requirements) over the delay-impaired Internet.
This paper proposes a prototype multi-view HD video transport system with synchronized multiplexing over IP networks. The proposed synchronized multiplexing covers both synchronization during video acquisition and multiplexing for interactive view selection during transport. For synchronized acquisition from multiple HDV camcorders through the IEEE 1394 interface, we estimate the timeline differences among the MPEG-2 compressed video streams by using a global network time between the cameras and a server, and we correct the timelines of the video streams by changing the timestamps of the MPEG-2 system streams. We also multiplex a selected number of acquired HD views at the MPEG-2 TS (transport stream) level for interactive view selection during transport. With the proposed synchronized multiplexing scheme, we can thus display synchronized HD views.
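The timeline-correction step can be pictured as shifting each stream's 90 kHz MPEG-2 timestamps by the camera-to-server clock offset measured over the network (the 33-bit field width follows the MPEG-2 systems format; measuring the offset itself is outside this sketch):

```python
def correct_timestamps(pts_list, offset_ticks):
    """Shift 90 kHz PTS values by a measured clock offset so streams
    from different camcorders share one timeline. MPEG-2 PTS fields
    are 33 bits wide, hence the wraparound."""
    return [(pts + offset_ticks) % (1 << 33) for pts in pts_list]

# One second (90000 ticks) of offset applied to two timestamps.
print(correct_timestamps([0, 90000], 90000))
```

After this shift, a multiplexer can interleave the selected views at the TS level and the decoder sees mutually consistent presentation times.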
Networked/Wireless Multimedia Systems and Technology
We propose a P2P (peer-to-peer) overlay architecture, called IGN (interest grouping network), for content lookup in the DHC (digital home community), which aims to provide a formalized, home-network-extended construction of current P2P file-sharing communities. The IGN utilizes Chord and the de Bruijn graph for its hierarchical overlay network construction. By combining the two schemes and inheriting their features, the IGN supports content lookup efficiently. More specifically, by introducing metadata-based lookup keywords, the IGN offers detailed content lookup that reflects user interests. Moreover, the IGN reflects the home network environments of the DHC by utilizing the HG (home gateway) of each home network as a participating node of the IGN. Through experiments and analysis, we show that the IGN is more efficient than Chord, a well-known DHT (distributed hash table)-based lookup protocol.
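For context, the Chord layer underneath such an overlay resolves a key by finding its successor on the identifier ring. A linear-scan sketch (real Chord uses finger tables to route in O(log N) hops):

```python
def chord_successor(node_ids, key, m=8):
    """Return the first node clockwise from `key` on a 2**m ring."""
    key %= 1 << m
    ring = sorted(node_ids)
    for n in ring:
        if n >= key:
            return n
    return ring[0]          # wrap around the ring

print(chord_successor([10, 50, 200], 60))   # node 200 stores key 60
```

The IGN's hierarchical construction layers a de Bruijn graph on top of this kind of ring so that interest groups can be reached with fewer hops than flat Chord.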
In this paper, a joint source and space-time decoding scheme is proposed for high-speed digital source transmission over fading channels. At the transmitter, a Reversible Variable Length Code (RVLC) is concatenated with a recursive space-time trellis code (recursive STTC). At the receiver, an iterative joint VLC and space-time decoding algorithm is proposed to fully utilize the residual redundancy introduced by the RVLC and the coding gain of the recursive space-time trellis code. Simulation results show that the proposed joint decoding system achieves better decoding performance over fading channels than a separate decoding system.
Delivering streaming media content over the Internet is a very challenging problem. Proxy servers have been introduced into streaming media delivery systems over the Internet, and many mechanisms, such as proxy caching and prefetching, have been proposed based on this structure. While the existing techniques can improve the performance of accesses to reused media objects, they are not effective in reducing the startup delay of first-time accesses. In this paper, we propose a more aggressive, server-assisted prefetching mechanism to reduce the startup delay of first-time accesses: proxy servers prefetch media objects before they are requested. To ensure the accuracy of this prefetching, we use the server's knowledge of access patterns to locate the most popular media objects and provide this information to proxy servers as hints for prefetching. A proxy server makes its decision based on the hints and its users' profiles and prefetches suitable objects before they are accessed. Results of trace-driven simulations show that our proposed mechanism can reduce the ratio of delayed requests by up to 38% with only a marginal increase in traffic.
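The proxy's decision step can be sketched as ranking server-advertised objects by popularity weighted by the local users' profile, then prefetching greedily within a storage budget. The field names and the multiplicative scoring rule below are our illustrative assumptions, not the paper's exact policy:

```python
def prefetch_set(server_hints, user_profile, budget):
    """Choose which hinted objects to prefetch before first access.
    server_hints: [{'id', 'genre', 'popularity', 'size'}, ...]
    user_profile: genre -> interest weight at this proxy."""
    score = lambda o: o["popularity"] * user_profile.get(o["genre"], 0.0)
    chosen, used = [], 0
    for obj in sorted(server_hints, key=score, reverse=True):
        if used + obj["size"] <= budget:
            chosen.append(obj["id"])
            used += obj["size"]
    return chosen

hints = [{"id": "news1", "genre": "news",  "popularity": 0.9, "size": 40},
         {"id": "clip7", "genre": "music", "popularity": 0.8, "size": 70},
         {"id": "film3", "genre": "film",  "popularity": 0.7, "size": 90}]
print(prefetch_set(hints, {"news": 1.0, "music": 0.5}, budget=120))
```

Objects prefetched this way are already on the proxy when a first-time request arrives, which is what removes the startup delay for those accesses.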
The James Webb Space Telescope (JWST) is expected to produce a vast number of images that are valuable for astronomical research and education. To support research activities related to the mission, the National Aeronautics and Space Administration (NASA) has provided funds to establish the Structures Pointing and Control Engineering (SPACE) Laboratory at California State University, Los Angeles (CSULA). One of the research activities in the SPACE lab is to design and implement an effective and efficient transmission system to disseminate JWST images across networks. In our previous research, a prioritized transmission method was proposed to provide the best quality of the transferred image based on the joint optimization of content-based retransmission and error concealment. In this paper, the design and implementation of a robust transmission system is presented that applies our previously proposed methods over error-prone links. The implemented system includes three parts. First, a zerotree-based error-resilient wavelet codec is used to compress the incoming astronomical image at the sender; tree-based interleaving is adopted in packetization to increase the system's capability to combat burst losses in error-prone channels. Second, various error concealment approaches are investigated and implemented at the receiver to improve the quality of the reconstructed image. Third, the transmission system uses UDP as the transport protocol, with an error control module that incorporates optimal retransmission under a delay constraint. A user-friendly graphical interface is designed to allow easy usage for users of diverse backgrounds.
The recent emergence of high-speed broadband Internet has driven next-generation services, such as high-definition (HD) video streaming, that benefit from the greater available bandwidth. The uncompressed HD video standard SMPTE 292M, widely used to interconnect HDTV equipment, requires about 1.5 Gbps of bandwidth. In this paper we propose a dual-stream transport (DST) solution in which the data stream is split across two Gigabit network interfaces in order to aggregate their bandwidth using off-the-shelf network components. In addition, we propose a flow scheduling scheme that minimizes the size of the receiver buffer needed to absorb jitter and improves performance. We note that the proposed solution does not require upgrades to existing network infrastructure, such as routers. Finally, experimental results highlight the performance of our implementation.
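The splitting idea can be sketched as tagging packets with sequence numbers and alternating them across the two interfaces, with the receiver merging by sequence number; the actual DST scheduler additionally balances the flows to bound receiver buffering:

```python
def split_stream(packets, n_links=2):
    """Round-robin a packet sequence over n links, keeping sequence
    numbers so the receiver can merge back into one ordered stream."""
    links = [[] for _ in range(n_links)]
    for seq, pkt in enumerate(packets):
        links[seq % n_links].append((seq, pkt))
    return links

def merge_streams(links):
    """Receiver side: reorder the union of all links by sequence number."""
    return [pkt for seq, pkt in sorted(p for link in links for p in link)]

links = split_stream(["a", "b", "c", "d", "e"])
print(links)
print(merge_streams(links))
```

Because the two links can have different queuing delays, the receiver buffer must hold packets until their predecessors arrive, which is exactly the buffer size the proposed scheduling scheme tries to minimize.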
The concept of a tree-based architecture for networked multi-player games was proposed by Matuszek to improve scalability of network traffic and, at the same time, reliability. The architecture (which we refer to as the Tree-Based Server-Middlemen-Client, or TB-SMC, architecture) addresses the two major problems in ad-hoc wireless networks, frequent link failures and significant battery power consumption at wireless transceivers, by using two new techniques: recursive aggregation of client messages and subscription-based propagation of game state. However, the performance of the TB-SMC architecture has never been quantitatively studied. In this paper, the TB-SMC architecture is compared with the client-server architecture using simulation experiments. We developed an event-driven simulator to evaluate the performance of the TB-SMC architecture. In the network traffic scalability experiments, the TB-SMC architecture produced less than 1/14 of the network traffic load for 200 end users. In the reliability experiments, the TB-SMC architecture improved the number of successfully delivered players' votes by 31.6%, 19.0%, and 12.4% over the client-server architecture at high (failure probability of 90%), moderate (50%), and low (10%) failure probabilities.
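Of the two TB-SMC techniques, recursive aggregation can be sketched as each middleman bundling its own clients' messages with the bundles received from child middlemen before forwarding a single message upward, which is what cuts the traffic reaching the server (the tree encoding below is our own illustration):

```python
def aggregate_up(tree, node):
    """Collect all client messages in the subtree rooted at `node`
    into one bundle, as a middleman would before forwarding upward."""
    bundle = list(tree[node].get("messages", []))
    for child in tree[node].get("children", []):
        bundle.extend(aggregate_up(tree, child))
    return bundle

tree = {
    "server": {"children": ["mid1", "mid2"]},
    "mid1":   {"children": [], "messages": ["move@c1", "fire@c2"]},
    "mid2":   {"children": [], "messages": ["move@c3"]},
}
print(aggregate_up(tree, "server"))
```

One upward message per middleman per tick, instead of one per client, is the source of the large traffic reduction measured for 200 end users.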
Information is very important to Internet users today. Unfortunately, searching for specific information on the Internet is not as easy as one would wish. Existing search engine mechanisms cannot use the pathname of a URL as a search key. Therefore, users who know a partial URL pathname cannot use that knowledge to narrow down the search results, and they have to spend a long time searching the result list for the required web site. This paper proposes a search protocol named the Information Searching Protocol (ISP) that supports multiple search contents for users who know a partial URL pathname and keywords. Moreover, the architecture of the Global Search Engine System (GSES), which cooperates with the ISP and is responsible for the search mechanism, is also proposed. The GSES consists of two separate parts: an ISP agent at the client site and the GSES components at the server site. These components allow users to perform a search using a URL pathname combined with keywords. The functions of the GSES components indicate that the ISP enhances the search mechanism: users receive more specific URLs and can quickly access the required site.
In this work, we propose a novel 3-D mesh editing algorithm using motion features. First, a vertex-wise motion vector is defined between the corresponding vertex pair of two sample meshes. Then, we
extract the motion feature for each vertex, which represents the similarity of neighboring vertex-wise motion vectors on a local mesh
region. When anchor vertices are moved by external force, the mesh
geometry is deformed such that the motion feature of each vertex is
preserved to the greatest extent. Extensive simulation results on
various mesh models demonstrate that the proposed mesh deformation
scheme yields visually pleasing editing results.
The primary goal of the current research was to develop image categorization algorithms that are more consistent with users' search strategies for their personal image collections. Other goals were to provide users with the option of correcting and labeling these image groups and to understand user behaviors and needs while using an automated image-organization system. The main focus of this paper is the automatic organization of images by two of the most important semantic classes in the consumer domain: events and people. Methods are described for automatically producing meaningful groups of images, whereby each group depicts an event, as well as clusters of similar faces in users' collections. Given that the proposed system envisions user interaction and is intended for organizing and searching personal collections, a usability study focused on consumers was conducted to gauge the performance of the system.
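Event grouping of this kind is commonly driven by the capture-time gaps between consecutive photos. A minimal sketch with a fixed one-hour threshold (the paper's actual event-segmentation algorithm may differ and need not use a fixed gap):

```python
def cluster_events(capture_times, gap_seconds=3600):
    """Start a new event whenever the gap between consecutive capture
    timestamps (sorted, in seconds) exceeds the threshold."""
    events, current = [], [capture_times[0]]
    for prev, t in zip(capture_times, capture_times[1:]):
        if t - prev > gap_seconds:
            events.append(current)
            current = []
        current.append(t)
    return events + [current]

# Two bursts of photos separated by several hours -> two events.
print(cluster_events([0, 600, 1200, 20000, 20300]))
```

Each resulting group can then be presented to the user for the correction and labeling step the study envisions.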
In this paper, we propose a joint source-channel coding scheme for progressive image transmission over binary symmetric channels (BSCs). The set partitioning in hierarchical trees (SPIHT) algorithm is used for source coding. Rate-compatible punctured turbo codes (RCPT) concatenated with multiple cyclic redundancy check (CRC) codes are adopted for channel protection. For a fixed transmission rate, the source and channel code rates are jointly optimized to maximize the expected image quality at the receiver. Two technical components that differ from existing methods are presented. First, a long data packet is divided into multiple CRC blocks before being coded by the turbo code. This secures the high coding gain of turbo codes, which is proportional to the interleaver size, while the leading blocks in a packet may still be usable even if decoding of the entire packet fails. Second, instead of exhaustive search, we give a genetic algorithm (GA)-based optimization method that finds appropriate channel code rates with low complexity. The effectiveness of the scheme is demonstrated through simulations.
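The first component, splitting a packet into independently checkable CRC blocks, can be sketched as follows (we use CRC-32 from the Python standard library for illustration; the paper's CRC polynomial and block length may differ):

```python
import zlib

def crc_blocks(payload, block_size):
    """Split a long packet into fixed-size blocks, each carrying its
    own CRC, so leading blocks stay usable even when later blocks fail
    their check after turbo decoding."""
    return [(payload[i:i + block_size],
             zlib.crc32(payload[i:i + block_size]))
            for i in range(0, len(payload), block_size)]

blocks = crc_blocks(b"progressive-bitstream", 8)
print(len(blocks), all(zlib.crc32(b) == c for b, c in blocks))
```

Because SPIHT produces an embedded bitstream, every prefix that passes its CRC still decodes to a usable image, which is what makes the partial-packet recovery worthwhile.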
A multiple description coding (MDC) technique for 3D surface geometry is proposed in this work. The encoder uses a plane-based representation to describe point samples. Then, those plane primitives are classified into two disjoint subsets or two descriptions, each of which provides equal contribution in 3D surface description. The two descriptions are compressed and transmitted over distinct channels. At the decoder, if both channels are available, the descriptions are decoded and merged together to reconstruct a high quality surface. If only one channel is available, we employ a surface interpolation method to fill visual holes and reconstruct a smooth surface. Therefore, the proposed algorithm can provide an acceptable reconstruction even though one channel is totally lost. Simulation results demonstrate that the proposed algorithm is a promising scheme for 3D data transmission over noisy channels.
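The description-forming step can be sketched as interleaving the plane primitives into two balanced subsets, with a merge on the receiver side when both channels arrive (the hole-filling interpolation for the single-channel case is omitted here):

```python
def split_descriptions(planes):
    """Alternate plane primitives into two equally informative descriptions."""
    return planes[0::2], planes[1::2]

def merge_descriptions(d0, d1):
    """Interleave the two decoded descriptions back into one set."""
    merged = []
    for a, b in zip(d0, d1):
        merged += [a, b]
    merged += d0[len(d1):] + d1[len(d0):]   # leftover when lengths differ
    return merged

d0, d1 = split_descriptions(["p0", "p1", "p2", "p3", "p4"])
print(d0, d1)
print(merge_descriptions(d0, d1))
```

Because the two subsets sample the surface evenly, losing either channel leaves holes that are small and scattered, which is what makes interpolation-based reconstruction from a single description acceptable.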
A novel image quality assessment method using the edge histogram descriptor (EHD) of MPEG-7 is presented. Neither additional data nor fragile watermarking is needed for quality assessment and image content authentication. Our method also does not need access to the original image as a reference: only the EHD metadata of the original image and of the received (noisy or altered) one are required. The peak signal-to-noise ratio (PSNR), or the mean-square error (MSE), is obtained by comparing the EHD extracted from the received image with that of the original image attached as metadata. It is then used to assess the level of image degradation and any illicit modification of the image. Experimental results show that the PSNRs calculated from the two EHDs are similar to those calculated by pixel-to-pixel comparison of the original and received images. This implies that one can use the EHD, instead of the image data, to calculate the PSNR for image assessment. Also, since the EHD extracted from the received image changes with alterations of the image content, the proposed method can also be used for image authentication.
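The core comparison can be sketched as computing an MSE between the two EHD bin vectors and expressing it on a PSNR-style dB scale. The peak constant below is an illustrative assumption, not the paper's calibration between EHD distance and pixel-domain PSNR:

```python
import math

def ehd_distance_db(ehd_ref, ehd_recv, peak=255.0):
    """PSNR-style score from the MSE between two edge-histogram
    descriptor bin vectors, instead of a pixel-to-pixel comparison."""
    mse = sum((a - b) ** 2 for a, b in zip(ehd_ref, ehd_recv)) / len(ehd_ref)
    if mse == 0.0:
        return float("inf")         # identical descriptors
    return 10.0 * math.log10(peak ** 2 / mse)

print(ehd_distance_db([4, 2, 7, 1], [4, 2, 7, 1]))   # inf: no degradation
```

A larger descriptor distance yields a lower dB score, which is the monotone relationship the method relies on for both quality assessment and tamper detection.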
The authors developed a revolving lantern using images from a holographic display. Our revolving lantern plays back virtual 3D images that float in the air; these spatial images include motion and interactive changes. The prototype imaging unit consists of a hologram, a turntable, and an illumination system that can change the color of the light so as to reconstruct various spatial images. In this paper, we describe spatial imaging with holographic technology and the reconstruction system, which plays back the rotating motion and various 3D images. A hologram plays back 3D images, but these reconstructions are generally static. A rotating image like a revolving lantern can be produced when the hologram spins on the turntable. A hologram can record and reconstruct various images using different wavelengths of the laser beam and illumination; when the illumination system changes the color of the illumination light, the hologram reconstructs other images.
Sponsored by the National Aeronautics and Space Administration (NASA), the Synergetic Education and Research in Enabling NASA-centered Academic Development of Engineers and Space Scientists (SERENADES) Laboratory was established at California State University, Los Angeles (CSULA). An important ongoing research activity in this lab is to develop easy-to-use image analysis software with automated object detection to facilitate astronomical research. This paper presents the design and implementation of an automated astronomical image analyzer. The core of this software is the automated object detection algorithm developed in our previous research, which is capable of detecting objects in near-galaxy images, including objects located within clouds. Beyond this functionality, human factors were considered in the system design, and tremendous effort has been devoted to enhancing user friendliness. Instead of using a command line or static menus, our software provides graphical methods that allow the user to directly manipulate the objects he or she wants to investigate. Comprehensive tests were conducted by users with and without astronomical backgrounds. Compared to current software tools such as IRAF and Skyview, our software has the following advantages: 1) no pre-training is required; 2) the amount of human supervision is significantly reduced by automated object detection; 3) batch processing is supported for fast operation; and 4) a high degree of human-computer interaction is realized for better usability.
In recent years, IT-based production and archiving of media has matured to a level which enables broadcasters to switch
over from tape- or CD-based to file-based workflows for the production of their radio and television programs. This
technology is essential for the future of broadcasters as it provides the flexibility and speed of execution the customer
demands by enabling, among others, concurrent access and production, faster than real-time ingest, edit during ingest,
centrally managed annotation and quality preservation of media. In terms of automation of program production, the radio
department is the most advanced within the VRT, the Flemish broadcaster. Since a couple of years ago, the radio
department has been working with digital equipment and producing its programs mainly on standard IT equipment.
Historically, the shift from analogue to digital based production has been a step by step process initiated and coordinated
by each radio station separately, resulting in a multitude of tools and metadata collections, some of them developed
in-house, lacking integration. To make matters worse, each of those stations adopted a slightly different production
methodology. The planned introduction of a company-wide Media Asset Management System allows a coordinated
overhaul to a unified production architecture. Benefits include the centralized ingest and annotation of audio material and
the uniform, integrated (in terms of IT infrastructure) workflow model. Needless to say, the ingest strategy, metadata
management and integration with radio production systems play a major role in the level of success of any improvement
effort. This paper presents a data model for audio-specific concepts relevant to radio production. It includes an
investigation of ingest techniques and strategies. Cooperation with external, professional production tools is
demonstrated through a use-case scenario: the integration of an existing, multi-track editing tool with a commercially
available Media Asset Management System. This will enable an uncomplicated production chain, with a recognizable
look and feel for all system users, regardless of their affiliated radio station, as well as central retrieval and storage of
information and metadata.
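A data model for audio-specific radio production concepts, with central annotation of audio material, can be pictured as a few linked record types. The sketch below is hypothetical: the class names, fields, and defaults are invented for illustration and are not the paper's actual schema.

```python
# Hypothetical sketch of an audio-specific production data model using
# Python dataclasses; names and fields are illustrative only.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Annotation:
    start_sec: float   # offset into the audio item
    end_sec: float
    label: str         # e.g. "jingle", "interview", "music"
    author: str        # who annotated (centrally managed annotation)

@dataclass
class AudioItem:
    item_id: str
    title: str
    duration_sec: float
    sample_rate: int = 48000
    annotations: List[Annotation] = field(default_factory=list)

    def annotate(self, start, end, label, author):
        """Attach a time-ranged annotation to this item."""
        self.annotations.append(Annotation(start, end, label, author))
```

In a unified architecture, every station's tools would read and write such records through the central Media Asset Management System rather than through per-station metadata collections.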
This paper presents a highlight extraction scheme for sports videos. The approach makes use of the transition logos inserted before and after the slow-motion replays by the broadcaster, which mark highlights of the game. First, the features of an MPEG compressed video are retrieved for subsequent processing. After the shot boundary detection procedure, the processing units are formed and the units with fast-moving scenes are selected. Finally, detection of overlaid objects is performed to signal the appearance of a transition logo. Experimental results show the feasibility of this promising method for sports video highlight extraction.
In this paper, we propose a scheme for TV news segmentation that explores efficient visual features. The proposed scheme consists of three parts: shot change detection based on skin color, probable anchorperson shot detection, and anchorperson detection. According to the experimental results, the proposed method can efficiently decompose TV news into anchorperson shots and report shots. Compared to traditional face detection methods, it can robustly exclude non-anchorperson shots within report shots, such as interview scenes. Experimental results are given to demonstrate the feasibility and efficiency of the proposed technique.
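The skin-color cue behind such a scheme can be illustrated very simply: classify a frame as a probable anchorperson shot when the fraction of skin-colored pixels lies in an expected range. The RGB rule and thresholds below are common published heuristics chosen for illustration, not the paper's exact detector.

```python
# Sketch of skin-color-ratio shot classification (illustrative values).
def is_skin(r, g, b):
    # A widely used explicit RGB skin heuristic.
    return (r > 95 and g > 40 and b > 20 and
            r > g and r > b and abs(r - g) > 15)

def probable_anchor_shot(pixels, lo=0.05, hi=0.5):
    """pixels: list of (r, g, b); True if the skin ratio is in [lo, hi]."""
    skin = sum(1 for p in pixels if is_skin(*p))
    ratio = skin / len(pixels)
    return lo <= ratio <= hi
```

An interview close-up would typically exceed the upper bound, which is one way such a ratio test can reject non-anchorperson shots that a plain face detector would accept.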
This paper presents a multi-modal two-level framework for news story segmentation designed to cope with large news video corpora such as the data used in the TREC video retrieval (TRECVID) evaluations. We divide our system into two levels: the shot level, which assigns one of the pre-defined semantic tags to each input shot, and the story level, which performs story segmentation based on the output of the shot level and other temporal features. We demonstrate the generality of our framework by employing two machine-learning approaches at the story level. The first approach employs a statistical method called Hidden Markov Models (HMM), whereas the second uses a rule induction technique. We tested both approaches on ~120 hours of news video provided by TRECVID 2003. The results demonstrate that our two-level machine-learning framework is effective and adequate for large-scale practical problems.
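The HMM story-level step can be pictured as decoding: given the per-shot semantic tags as observations, recover the most likely hidden story-state sequence with the Viterbi algorithm. The sketch below shows standard Viterbi decoding; the states, tags, and probabilities used in the test are toy values, not the paper's trained model.

```python
# Standard Viterbi decoding over a discrete HMM (illustrative use: mapping
# per-shot semantic tags to hidden story states).
def viterbi(obs, states, start_p, trans_p, emit_p):
    # V[t][s]: probability of the best path ending in state s at time t
    V = [{s: start_p[s] * emit_p[s][obs[0]] for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states)
            V[t][s] = prob
            back[t][s] = prev
    # backtrack from the best final state
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]
```

Story boundaries then fall wherever the decoded hidden state changes; the rule-induction alternative replaces this decoder with learned if-then rules over the same shot-level tags.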
Multimedia Adaptation and Collaborative Environment
Virtual Reality simulation enables an immersive 3D experience of a Virtual Environment. A simulation-based Virtual Environment can be used to map real-world phenomena onto a virtual experience. With a reconfigurable simulation, users can adjust the parameters of the objects involved and observe the effects of the different configurations. This concept is suitable for classroom learning of the laws of physics. This research studies the Virtual Reality simulation of Newtonian physics on rigid-body objects. With network support, collaborative interaction is enabled so that people in different places can interact with the same set of objects in an immersive Collaborative Virtual Environment. The taxonomy of interaction at different levels of collaboration distinguishes between distinct objects and the same object, with same-object interaction subdivided into: same object - sequentially; same object - concurrently - same attribute; and same object - concurrently - distinct attributes. The case studies concern user interaction in two scenarios: destroying and creating a set of arranged rigid bodies. In Virtual Domino, users can observe the laws of physics while applying force to domino blocks in order to destroy the arrangements. In Virtual Dollhouse, users can observe the laws of physics while constructing a dollhouse from existing building blocks under the effects of gravity.
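The gravity effect a dropped building block would exhibit can be sketched as a simple integration step of Newton's second law, which is the kind of computation a rigid-body simulation repeats every frame. The time step, units, and function name below are illustrative assumptions, not the system's actual physics engine.

```python
# Minimal sketch: semi-implicit Euler integration of a block falling
# under gravity until it reaches the floor.
G = -9.81  # gravitational acceleration, m/s^2 (downward)

def simulate_fall(height, dt=0.01, floor=0.0):
    """Drop a block from `height` m; return (elapsed_time, final_height)."""
    y, vy, t = height, 0.0, 0.0
    while y > floor:
        vy += G * dt   # update velocity from acceleration
        y += vy * dt   # update position from velocity
        t += dt
    return t, max(y, floor)
```

A full rigid-body engine adds collision response and rotation on top of this integration loop, which is what makes toppling dominoes or stacked dollhouse blocks behave plausibly.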
Multi-party collaborative environments based on the AG (Access Grid) are extensively utilized for distance learning, e-science, and other distributed global collaboration events. In such environments, A/V media services play an important role in providing QoE (quality of experience) to participants in collaboration sessions. In this paper, in order to support a high-quality user experience with respect to video services, we design an integration architecture that combines high-quality video services with a high-resolution tiled display service. In detail, the proposed architecture incorporates video services for DV (digital video) and HDV (high-definition digital video) streaming with a display service that provides decomposable decoding/display for a tiled display system. By implementing the proposed architecture on top of the AG, we verify that high-quality collaboration between a couple of collaboration sites can be realized over a multicast-enabled network testbed with improved media quality.
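The decomposable-display idea can be illustrated by the geometry alone: a decoded high-definition frame is split into rectangular regions, one per tile of the display wall, so that each tile node renders only its own crop. The grid shape and frame size below are illustrative parameters, not the system's configuration.

```python
# Sketch: compute per-tile crop rectangles for a tiled display wall.
def tile_regions(width, height, cols, rows):
    """Return crop rectangles (x, y, w, h), one per tile, row-major."""
    tw, th = width // cols, height // rows
    return [(c * tw, r * th, tw, th)
            for r in range(rows) for c in range(cols)]
```

In a decomposable decoding scheme, each tile node would ideally decode only the slices covering its rectangle rather than the full frame, which is what keeps per-node cost bounded as the wall grows.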
The authors describe a 360-degree viewing display that can be viewed from any direction. This new viewing system consists of an EL screen and a rotating table. The principle is very simple: the screen of a monitor rotates at a uniform speed, so an observer can view the monitor screen from any position around the round table. However, a solid of revolution is formed when the image screen rotates. Hence, the angle of view is controlled by a slit or an optical element so that the screen faces the observer, who then sees only a 2D image on the screen rather than a 3D solid image.
The Access Grid (AG) is a multi-party collaboration environment in which efficient media exchange among remote participants is supported by IP multicast. However, due to the limited availability of native IP multicast service, application-assisted multicast connectivity solutions are required in real AG operation. Recently, we proposed a UMTP (UDP Multicast Tunneling Protocol) based solution, named AG Connector, which takes advantage of a UDP-based tunneling approach. In this paper, we extend the original AG Connector solution by addressing the coordination of multiple UMTP servers and other operational issues. The proposed extended multicast connectivity solution can support an increased number of AG nodes, even including nodes in NAT/firewall-based private networks.
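The tunneling principle can be sketched in a few lines: a multicast datagram is wrapped with a small header recording its original group and port, carried over plain unicast UDP (which traverses NATs and firewalls), and unwrapped at the far end for local re-multicast. The 8-byte header layout below is an invented illustration, not the actual UMTP wire format.

```python
# Conceptual sketch of multicast-over-unicast encapsulation.
import socket
import struct

def encapsulate(group, port, payload):
    """Prefix payload with its original multicast group and port."""
    header = socket.inet_aton(group) + struct.pack("!HH", port, len(payload))
    return header + payload

def decapsulate(packet):
    """Recover (group, port, payload) from a tunneled packet."""
    group = socket.inet_ntoa(packet[:4])
    port, length = struct.unpack("!HH", packet[4:8])
    return group, port, packet[8:8 + length]
```

Coordinating multiple tunnel servers then amounts to deciding which server each private-network node registers with, and having servers relay encapsulated packets among themselves.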
As digital broadcasting technologies have rapidly progressed, users' expectations for realistic and interactive broadcasting services have also increased. As one such service, 3D multi-view broadcasting has received much attention recently. In general, all the view sequences acquired at the server are transmitted to the client. Then, the user can select some or all of the views according to the display capabilities. However, this kind of system requires high processing power at both the server and the client, posing a difficulty for practical applications. To overcome this problem, a relatively simple method is to transmit only the two view-sequences requested by the client in order to deliver a stereoscopic video. In such a system, effective communication between the server and the client is an important aspect.
In this paper, we propose an efficient multi-view system that transmits two view-sequences and their depth maps according to the user's request. The view selection process is integrated into MPEG-21 DIA (Digital Item Adaptation) so that our system is compatible with the MPEG-21 multimedia framework. DIA is generally composed of resource adaptation and descriptor adaptation. One merit of our approach is that the SVA (stereoscopic video adaptation) descriptors defined in the DIA standard are used to deliver users' preferences and device capabilities. Furthermore, multi-view descriptions related to the multi-view camera and system are newly introduced. The syntax of the descriptions and their elements is represented in an XML (eXtensible Markup Language) schema. If the client sends an adapted descriptor (e.g., view numbers) to the server, the server responds with the associated view sequences. Finally, we present a method that can reduce the visual discomfort a user might experience while viewing stereoscopic video. This discomfort occurs when the view changes, and also when a stereoscopic image produces excessive disparity caused by a large baseline between two cameras. To address the former, IVR (intermediate view reconstruction) is employed for a smooth transition between two stereoscopic view sequences; for the latter, a disparity adjustment scheme is used.
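The request flow can be pictured as the client serializing its view selection into an XML descriptor and posting it to the server. The sketch below builds such a descriptor with the standard library; the element and attribute names are invented for illustration and are not the MPEG-21 DIA / SVA schema.

```python
# Hypothetical view-selection request descriptor (invented element names).
import xml.etree.ElementTree as ET

def build_view_request(left_view, right_view, max_disparity):
    """Serialize a two-view request plus a display preference to XML."""
    root = ET.Element("ViewSelection")
    views = ET.SubElement(root, "RequestedViews")
    views.set("left", str(left_view))
    views.set("right", str(right_view))
    pref = ET.SubElement(root, "DisplayPreference")
    pref.set("maxDisparity", str(max_disparity))
    return ET.tostring(root, encoding="unicode")
```

On receipt, the server would validate such a descriptor against its schema and stream the two requested view sequences together with their depth maps.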
Finally, the implementation of a testbed and the accompanying experiments demonstrate the value and feasibility of our system.
Although digital watermarking can be considered one of the key technologies for implementing copyright protection of digital contents distributed on the Internet, most content distribution models based on watermarking protocols proposed in the literature have been purposely designed for fixed networks and cannot be easily adapted to mobile networks. On the contrary, the use of mobile devices currently enables new types of services and business models, which makes the development of new content distribution models for mobile environments strategic in the current Internet scenario. This paper presents and discusses a distribution model of watermarked digital contents for such environments that achieves a trade-off between the needs of efficiency and security.
In this paper, we propose a new scheme for blind watermarking of three-dimensional (3D) point clouds in the QSplat representation. The proposed watermarking algorithm can support authentication, proof of ownership, and copyright protection of 3D data. We apply quantization index modulation (QIM) to the QSplat position data, such that the quantization indices of points are mapped to either an even or an odd set according to the watermark. The same watermark is repeatedly embedded into a cluster of the 3D model at a low resolution to guarantee the robustness of the watermark. At the decoder, the watermark is extracted in a blind manner, without requiring the original model. Experimental results show that the proposed watermarking algorithm is robust against numerous attacks, including additive random noise, translation, cropping, simplification, and their combinations.
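The QIM primitive behind such embedding can be sketched on a single scalar coordinate: quantize with step delta and force the parity of the quantization index to encode one watermark bit; extraction reads the parity of the nearest index, which needs no original model (hence blind). The function names and step size below are illustrative; the paper applies this per cluster to QSplat position data.

```python
# Minimal scalar QIM sketch: parity of the quantization index carries a bit.
def qim_embed(value, bit, delta):
    """Embed one bit by snapping value to an even- or odd-parity level."""
    index = round(value / delta)
    if index % 2 != bit:
        # move to the adjacent index on the side nearer the original value
        index += 1 if value >= index * delta else -1
    return index * delta

def qim_extract(value, delta):
    """Recover the bit from the parity of the nearest index (blind)."""
    return round(value / delta) % 2
```

Because each watermarked value sits at the center of a parity cell of width 2*delta, any perturbation smaller than delta/2 leaves the extracted bit unchanged, which is the source of the robustness to small additive noise.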