This PDF file contains the front matter associated with SPIE Proceedings Volume 8437, including the Title Page, Copyright information, Table of Contents, and the Conference Committee listing.
The trend towards computers with multiple processing units continues with no end in sight. Modern consumer
computers come with 2-6 processing units. Programming methods have been unable to keep up with this fast
development. In this paper we present a framework that uses a dataflow model for parallel processing: the Generic
Parallel Rapid Development Toolkit, GePaRDT. This intuitive programming model eases the concurrent usage
of many processing units without specialized knowledge about parallel programming methods and their pitfalls.
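The abstract contains no code; the sketch below only illustrates the general dataflow idea — independent processing nodes connected by queues, so that several cores can be used without explicit locking. The node and queue names are illustrative and are not part of the GePaRDT API.

```python
# Minimal dataflow-style pipeline sketch: each node runs in its own process and
# communicates only through queues, so the runtime can use several cores without
# the application code dealing with locks. Names are illustrative, not GePaRDT's API.
from multiprocessing import Process, Queue

def source(out_q, n):
    for i in range(n):
        out_q.put(i)
    out_q.put(None)                      # end-of-stream marker

def square(in_q, out_q):
    while (item := in_q.get()) is not None:
        out_q.put(item * item)
    out_q.put(None)

def sink(in_q):
    while (item := in_q.get()) is not None:
        print(item)

if __name__ == "__main__":
    q1, q2 = Queue(), Queue()
    stages = [Process(target=source, args=(q1, 10)),
              Process(target=square, args=(q1, q2)),
              Process(target=sink, args=(q2,))]
    for p in stages: p.start()
    for p in stages: p.join()
```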
A novel approach to the segmentation of images of different nature is presented, employing feature extraction in wavelet transform (WT) space before the segmentation process. According to AUC analysis, the designed frameworks (W-FCM, W-CPSFCM and WK-Means) demonstrated better performance than other algorithms in the literature in numerous simulation experiments with synthetic and dermoscopic images. The novel W-CPSFCM algorithm estimates the number of clusters automatically, without the intervention of a specialist. An implementation of the proposed segmentation algorithms on the Texas Instruments TMS320DM642 DSP demonstrates that real-time processing of images of different nature is possible.
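As a rough illustration of the overall idea (per-pixel wavelet-domain features fed to a clustering-based segmenter), the following sketch uses PyWavelets and plain k-means as stand-ins; it is not the authors' W-FCM, W-CPSFCM or WK-Means implementation, and the feature construction is deliberately simplified.

```python
# Sketch: extract wavelet-domain features per pixel and cluster them (a plain
# k-means stand-in for the paper's W-FCM / WK-Means frameworks).
import numpy as np
import pywt
from sklearn.cluster import KMeans

def wavelet_segment(img, n_clusters=3, wavelet="haar"):
    cA, (cH, cV, cD) = pywt.dwt2(img.astype(float), wavelet)
    # Upsample each subband back to the image size so every pixel gets a feature vector.
    feats = [np.kron(b, np.ones((2, 2)))[:img.shape[0], :img.shape[1]]
             for b in (cA, cH, cV, cD)]
    X = np.stack(feats, axis=-1).reshape(-1, 4)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(X)
    return labels.reshape(img.shape)
```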
In recent years, real-time video communication over the internet has been widely utilized for applications like video
conferencing. Streaming live video over heterogeneous IP networks, including wireless networks, requires video coding
algorithms that can support various levels of quality in order to adapt to the network end-to-end bandwidth and
transmitter/receiver resources. In this work, a scalable video coding and compression algorithm based on the Contourlet
Transform is proposed. The algorithm allows for multiple levels of detail, without re-encoding the video frames, by just
dropping the encoded information referring to higher resolution than needed. Compression is achieved by means of lossy
and lossless methods, as well as variable bit rate encoding schemes. Furthermore, due to the transformation utilized, it
does not suffer from blocking artifacts that occur with many widely adopted compression algorithms. Another highly
advantageous characteristic of the algorithm is the suppression of noise induced by low-quality sensors usually
encountered in web-cameras, due to the manipulation of the transform coefficients at the compression stage. The
proposed algorithm is designed to introduce minimal coding delay, thus achieving real-time performance. Performance is
enhanced by utilizing the vast computational capabilities of modern GPUs, providing satisfactory encoding and decoding
times at relatively low cost. These characteristics make this method suitable for applications like video-conferencing that
demand real-time performance, along with the highest visual quality possible for each user. Through the presented
performance and quality evaluation of the algorithm, experimental results show that the proposed algorithm achieves
better or comparable visual quality relative to other compression and encoding methods tested, while maintaining a
satisfactory compression ratio. Especially at low bitrates, it provides more human-eye friendly images compared to
algorithms utilizing block-based coding, like the MPEG family, as it introduces fuzziness and blurring instead of
artificial block artifacts.
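Since contourlet implementations are not part of standard libraries, the sketch below uses a wavelet decomposition as a stand-in to illustrate the scalability principle described above: the frame is decomposed once, and a lower level of detail is obtained simply by discarding the finest subbands rather than re-encoding.

```python
# Sketch of resolution scalability: decompose once, then reconstruct at a lower
# level of detail by zeroing (i.e. not transmitting) the finest subbands.
# A wavelet decomposition stands in for the contourlet transform used in the paper.
import numpy as np
import pywt

def encode(frame, levels=3, wavelet="db2"):
    return pywt.wavedec2(frame.astype(float), wavelet, level=levels)

def decode(coeffs, drop_finest=0, wavelet="db2"):
    kept = list(coeffs)
    for i in range(1, drop_finest + 1):
        kept[-i] = tuple(np.zeros_like(c) for c in kept[-i])  # drop finest detail bands
    return pywt.waverec2(kept, wavelet)
```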
A real-time iris detection and tracking algorithm has been implemented on a smart camera using LabVIEW graphical
programming tools. The program detects the eye and finds the center of the iris, which is recorded and stored in
Cartesian coordinates. In subsequent video frames, the location of the center of the iris corresponding to the previously
detected eye is computed and recorded for a desired period of time, creating a list of coordinates representing the
moving iris center location across image frames. We present an application for the developed smart camera iris tracking
system that involves the assessment of reading patterns. The purpose of the study is to identify differences in reading
patterns of readers at various levels to eventually determine successful reading strategies for improvement. The readers
are positioned in front of a computer screen with a fixed camera directed at the reader's eyes. The readers are then asked
to read preselected content on the computer screen, one comprising a traditional newspaper text and one a Web page.
The iris path is captured and stored in real-time. The reading patterns are examined by analyzing the path of the iris
movement. In this paper, the iris tracking system and algorithms, application of the system to real-time capture of
reading patterns, and representation of 2D/3D iris track are presented with results and recommendations.
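The paper's detection runs in LabVIEW on the smart camera and its exact method is not reproduced here; the following OpenCV sketch shows one plausible way to locate an iris candidate as a circle and log its centre per frame. All parameter values are illustrative.

```python
# Sketch: locate a circular iris candidate in an eye region and log its center
# per frame. OpenCV's Hough circle detector is a stand-in for the paper's
# LabVIEW-based detection; parameter values are illustrative only.
import cv2
import numpy as np

def iris_center(eye_gray):
    blurred = cv2.medianBlur(eye_gray, 5)
    circles = cv2.HoughCircles(blurred, cv2.HOUGH_GRADIENT, dp=1, minDist=50,
                               param1=100, param2=30, minRadius=8, maxRadius=60)
    if circles is None:
        return None
    x, y, r = np.round(circles[0, 0]).astype(int)
    return (x, y)

track = []                               # list of (frame_index, x, y)
cap = cv2.VideoCapture(0)
for idx in range(300):                   # a short capture period
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    c = iris_center(gray)
    if c is not None:
        track.append((idx, *c))
cap.release()
```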
This paper introduces a new method for the fast calibration of inertial measurement units (IMUs) that are rigidly coupled to cameras. That is, the relative rotation and translation between the IMU and the camera are estimated, allowing IMU data to be transferred to the camera's coordinate frame. Moreover, the IMU's nuisance parameters (biases and scales) and the horizontal alignment of the initial camera frame are determined. Since an iterated Kalman filter is used for estimation, information on the estimation's precision is also available. Such calibrations are crucial for IMU-aided visual robot navigation, i.e. SLAM, since wrong calibrations cause biases and drifts in the estimated position and orientation. As the estimation is performed in real time, the calibration can be done using a freehand movement and the estimated parameters can be validated immediately. This provides the opportunity to optimize the trajectory used online, increasing the quality and minimizing the time effort for calibration. Except for a marker pattern used for visual tracking, no additional hardware is required.
As will be shown, the system is capable of estimating the calibration within a short period of time. Depending on the requested precision, trajectories of 30 seconds to a few minutes are sufficient. This allows the system to be calibrated at startup. In this way, deviations in the calibration due to transport and storage can be compensated. The estimation quality and consistency are evaluated in dependence on the traveled trajectories and the amount of IMU-camera displacement and rotation misalignment. It is analyzed how different types of visual markers, i.e. 2- and 3-dimensional patterns, affect the estimation. Moreover, the method is applied to mono and stereo vision systems, providing information on the applicability to robot systems. The algorithm is implemented using a modular software framework, such that it can be adapted to altered conditions easily.
Phase microscopy techniques have regained interest because they allow the observation of unprepared specimens with excellent temporal resolution. Tomographic diffractive microscopy is an extension of holographic microscopy that permits 3D observations with a finer resolution than incoherent light microscopes. Specimens are imaged by a series of 2D holograms: their accumulation progressively fills the range of frequencies of the specimen in Fourier space. A 3D inverse FFT eventually provides a spatial image of the specimen.
Consequently, acquisition followed by reconstruction is required to produce an image, which hinders real-time control of the observed specimen. The MIPS Laboratory has built a tomographic diffractive microscope with an unsurpassed 130 nm resolution but a low imaging speed of no less than one minute per acquisition. Afterwards, a high-end PC reconstructs the 3D image in 20 seconds. We now aim at an interactive system providing preview images during the acquisition for monitoring purposes.
We first present a prototype implementing this solution on the CPU: acquisition and reconstruction are tied together in a producer-consumer scheme, sharing common data in CPU memory. Then we present a prototype dispatching some reconstruction tasks to the GPU in order to take advantage of SIMD parallelization for the FFT and of higher bandwidth for the filtering operations. The CPU scheme takes 6 seconds for a 3D image update, while the GPU scheme can go down to 1-2 seconds depending on the GPU class. This opens opportunities for 4D imaging of living organisms or crystallization processes. We also consider the relevance of the GPU for 3D image interaction in our specific conditions.
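A minimal sketch of the CPU producer-consumer scheme described above follows: one thread stands in for the acquisition of 2D holograms, while a second thread accumulates them in Fourier space and periodically runs a 3D inverse FFT to produce a preview. Array sizes and the frequency-space mapping are placeholders, not the actual microscope geometry.

```python
# Sketch of the producer-consumer scheme: an acquisition thread pushes 2D holograms
# into a queue while a reconstruction thread accumulates them in Fourier space and
# periodically produces a 3D preview via an inverse FFT. Sizes and the frequency
# mapping are placeholders, not the MIPS microscope's actual geometry.
import queue, threading
import numpy as np

N, SLICES = 128, 128
holograms = queue.Queue()

def acquire(n_holograms=64):
    for k in range(n_holograms):
        holograms.put((k, np.random.rand(N, N)))   # stand-in for camera frames
    holograms.put(None)

def reconstruct():
    spectrum = np.zeros((SLICES, N, N), dtype=complex)
    while (item := holograms.get()) is not None:
        k, holo = item
        spectrum[k % SLICES] += np.fft.fft2(holo)  # naive Fourier-space fill
        if k % 16 == 15:
            preview = np.abs(np.fft.ifftn(spectrum))
            print("preview update, max intensity:", preview.max())

threads = [threading.Thread(target=acquire), threading.Thread(target=reconstruct)]
for t in threads: t.start()
for t in threads: t.join()
```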
Real-time image and video processing applications require skilled architects, and recent trends in the hardware
platform make the design and implementation of these applications increasingly complex. Many frameworks and
libraries have been proposed or commercialized to simplify the design and tuning of real-time image processing
applications. However, they tend to lack flexibility because they are normally oriented towards particular types
of applications, or they impose specific data processing models such as the pipeline. Other issues include large
memory footprints, difficulty for reuse and inefficient execution on multicore processors. This paper presents a
novel software architecture for real-time image and video processing applications which addresses these issues.
The architecture is divided into three layers: the platform abstraction layer, the messaging layer, and the
application layer. The platform abstraction layer provides a high level application programming interface for
the rest of the architecture. The messaging layer provides a message passing interface based on a dynamic
publish/subscribe pattern. Topic-based filtering, in which messages are published to topics, is used to route
messages from publishers to the subscribers interested in a particular type of message. The application
layer provides a repository for reusable application modules designed for real-time image and video processing
applications. These modules, which include acquisition, visualization, communication, user interface and data
processing modules, take advantage of the power of other well-known libraries such as OpenCV, Intel IPP,
or CUDA. Finally, we present different prototypes and applications to show the possibilities of the proposed
architecture.
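A minimal sketch of the topic-based publish/subscribe routing described above is shown below; the class and topic names are illustrative and do not correspond to the actual messaging layer API.

```python
# Minimal topic-based publish/subscribe router in the spirit of the messaging
# layer described above. Class and topic names are illustrative.
from collections import defaultdict

class MessageBus:
    def __init__(self):
        self._subscribers = defaultdict(list)   # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        for cb in self._subscribers[topic]:     # route only to interested subscribers
            cb(message)

bus = MessageBus()
bus.subscribe("frames/raw", lambda m: print("visualization got", m["id"]))
bus.subscribe("frames/raw", lambda m: print("processing got", m["id"]))
bus.publish("frames/raw", {"id": 42, "data": b"..."})
```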
This paper presents a stereo image matching system that takes advantage of a global image matching method. The system is designed to provide depth information for mobile robotic applications. Typical tasks of the proposed system are to assist in obstacle avoidance, SLAM and path planning. Mobile robots impose strict requirements on the size, energy consumption, reliability and output quality of the image matching subsystem. Currently available systems rely either on active sensors or on local stereo image matching algorithms. The former are only suitable in controlled environments, while the latter suffer from low-quality depth maps. Top-ranking quality results are only achieved by an iterative approach using global image matching and color segmentation techniques, which are computationally demanding and therefore difficult to execute in real time. Attempts have been made to reach real-time performance with global methods by simplifying the routines, but the resulting depth maps end up only about comparable to those of local methods. A semi-global algorithm was proposed earlier that offers both very good image matching results and relatively simple operations. A memory-efficient variant of this Semi-Global Matching algorithm is reviewed and adapted for an implementation based on reconfigurable hardware. The implementation is suitable for real-time execution in the field of robotics. It will be shown that the modified version of the efficient Semi-Global Matching method delivers results equivalent to the original algorithm on the Middlebury dataset.
The system has proven capable of processing VGA-sized images with a disparity resolution of 64 pixels at 33 frames per second on low-cost to mid-range hardware. If the focus is shifted to a higher image resolution, 1024×1024 stereo frames can be processed with the same hardware at 10 fps, with the disparity resolution settings unchanged. A mobile system that covers preprocessing, matching and interfacing operations is also presented.
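For illustration, the sketch below shows the core Semi-Global Matching cost-aggregation recurrence along a single path direction in NumPy; the reviewed memory-efficient FPGA variant aggregates several directions and differs in implementation detail. Penalty values are illustrative.

```python
# Sketch of the Semi-Global Matching cost aggregation along a single path
# (left-to-right). A full implementation aggregates several path directions and,
# in the paper, runs on reconfigurable hardware; this NumPy version only shows
# the recurrence. P1/P2 values are illustrative.
import numpy as np

def aggregate_left_to_right(cost, P1=10, P2=120):
    """cost: (H, W, D) matching cost volume; returns aggregated cost of same shape."""
    H, W, D = cost.shape
    L = np.zeros_like(cost, dtype=float)
    L[:, 0, :] = cost[:, 0, :]
    for x in range(1, W):
        prev = L[:, x - 1, :]                      # (H, D)
        prev_min = prev.min(axis=1, keepdims=True) # best cost at previous pixel
        shifted_m = np.full_like(prev, np.inf)
        shifted_p = np.full_like(prev, np.inf)
        shifted_m[:, 1:] = prev[:, :-1]            # disparity d-1
        shifted_p[:, :-1] = prev[:, 1:]            # disparity d+1
        candidates = np.minimum.reduce([prev,
                                        shifted_m + P1,
                                        shifted_p + P1,
                                        prev_min + P2])
        L[:, x, :] = cost[:, x, :] + candidates - prev_min
    return L
```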
An important task in film and video preservation is the quality assessment of the content to be archived or reused out of the archive. This task, if done manually, is a straining and time-consuming process, so it is highly recommended to automate it as far as possible. In this paper, we show how to port a previously proposed algorithm for the detection of severe analog and digital video distortions (termed "video breakup") efficiently to NVIDIA GPUs of the Fermi architecture using CUDA. By massively parallelizing the algorithm in order to make use of the hundreds of cores on a typical GPU, and by careful use of GPU features like atomic functions, texture memory and shared memory, we achieve a speedup of roughly 10-15 when comparing the GPU implementation with a highly optimized, multi-threaded CPU implementation. Thus our GPU algorithm is able to analyze nine Full HD (1920 × 1080) video streams or 40 standard definition (720 × 576) video streams in real time on a single inexpensive NVIDIA GeForce GTX 480 GPU. Additionally, we present the AV-Inspector application for video quality analysis, into which the video breakup algorithm has been integrated.
There are a number of challenges caused by the large amount of data and the limited resources when implementing vision systems on wireless smart cameras using embedded platforms. Generally, the common challenges include limited memory, limited processing capability, power consumption in the case of battery-operated systems, and bandwidth. Research in this field usually focuses on the development of a specific solution for a particular problem. In order to implement vision systems on an embedded platform, designers must first investigate the resource requirements of a design; failure to do so may result in additional design time and cost to meet the specifications. There is a need for a tool that can predict the resource requirements for the development and comparison of vision solutions on wireless smart cameras. To accelerate the development of such a tool, we have used a system taxonomy, which shows that the majority of vision systems for wireless smart cameras share common tasks focused on object detection, analysis and recognition. In this paper, we have investigated the arithmetic complexity and memory requirements of vision functions by using the system taxonomy, and we propose an abstract complexity model. To demonstrate the use of this model, we have analysed a number of implemented systems and show that the complexity model, together with the system taxonomy, can be used for the comparison and generalization of vision solutions. The study will assist researchers and designers in predicting the resource requirements of different classes of vision systems implemented on wireless smart cameras, in less time and with little effort. This in turn makes the comparison and generalization of solutions for wireless smart cameras simpler.
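The paper's complexity model is not reproduced here; the toy estimator below merely illustrates how such an abstract model can be used, summing assumed per-pixel operation counts and buffer sizes over the stages of a hypothetical detection pipeline.

```python
# Toy resource estimator in the spirit of the abstract complexity model: sum
# per-pixel arithmetic operations and intermediate buffer sizes for each vision
# function in a pipeline. The operation counts are illustrative placeholders,
# not the figures derived in the paper.
def estimate(pipeline, width, height, bytes_per_pixel=1):
    pixels = width * height
    total_ops = sum(stage["ops_per_pixel"] * pixels for stage in pipeline)
    total_mem = sum(stage["buffers"] * pixels * bytes_per_pixel for stage in pipeline)
    return total_ops, total_mem

pipeline = [
    {"name": "background subtraction", "ops_per_pixel": 4,  "buffers": 2},
    {"name": "morphological filter",   "ops_per_pixel": 9,  "buffers": 1},
    {"name": "connected components",   "ops_per_pixel": 12, "buffers": 2},
]
ops, mem = estimate(pipeline, 640, 480)
print(f"~{ops/1e6:.1f} M operations/frame, ~{mem/1024:.0f} KiB of buffers")
```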
Work towards the standardisation of High Efficiency Video Coding (HEVC), the next generation video coding scheme,
is currently gaining pace. HEVC offers the prospect of a 50% improvement in compression over the current H.264
Advanced Video Coding standard (H.264/AVC). Thus far, work on HEVC has concentrated on improvements to the
coding efficiency and has not yet addressed transmission in networks other than to mandate byte stream compliance with
Annex B of H.264/AVC. For practical networked HEVC applications a number of essential building blocks have yet to
be defined. In this work, we design and prototype a real-time HEVC streaming system and empirically evaluate its performance; in particular, we consider the robustness of the current Test Model under Consideration (TMuC HM4.0) for HEVC to packet loss caused by a reduction in available bandwidth, both in terms of decoder resilience and degradation in perceptual video quality.
A NAL unit packetisation and streaming framework for HEVC encoded video streams is designed, implemented and
empirically tested in a number of streaming environments including wired, wireless, single path and multiple path
network scenarios. As a first step the HEVC decoder's error resilience is tested under a comprehensive set of packet loss
conditions, and a simple error concealment method for HEVC is implemented. As with H.264 encoded streams, the size and distribution of NAL units within an HEVC stream and the nature of the NAL unit dependencies influence the packetisation and streaming strategies that may be employed for such streams. The relationships between HEVC
encoding mode and the quality of the received video are shown under a wide range of bandwidth constraints. HEVC
streaming is evaluated in both single and multipath network configuration scenarios.
Through the use of extensive experimentation, we establish a comprehensive set of benchmarks for HEVC streaming in
loss prone network environments. We show the visual quality reduction in terms of PSNR which results from a reduction
in available bandwidth. To the best of our knowledge, this is the first time that such a fully functional streaming system
for HEVC, together with the benchmark evaluation results, has been reported. This study will open up more timely
research opportunities in this cutting edge area.
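As a simplified illustration of NAL unit packetisation (not the framework implemented in the paper), the sketch below splits an Annex B byte stream on start codes and groups the resulting NAL units into size-limited packets; fragmentation of oversized units, sequence numbering and RTP headers are omitted.

```python
# Simplified illustration of NAL unit packetisation: split an Annex B byte stream
# on start codes and group the resulting NAL units into size-limited packets.
# This omits fragmentation, sequencing and RTP headers used by a real streamer.
import re

START_CODE = re.compile(b"\x00\x00\x00\x01|\x00\x00\x01")

def split_nal_units(annexb_stream):
    positions = [m.start() for m in START_CODE.finditer(annexb_stream)]
    positions.append(len(annexb_stream))
    return [annexb_stream[positions[i]:positions[i + 1]]
            for i in range(len(positions) - 1)]

def packetise(nal_units, mtu=1400):
    packets, current = [], b""
    for nal in nal_units:
        if current and len(current) + len(nal) > mtu:
            packets.append(current)
            current = b""
        current += nal
    if current:
        packets.append(current)
    return packets
```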
In this study, we develop a real-time, color histogram-based tracking system for multiple color-patterned objects
in a 512×512 image at 2000 fps. Our system can simultaneously extract the positions, areas, orientation angles,
and color histograms of multiple objects in an image using the hardware implementation of a multi-object,
color histogram extraction circuit module on a high-speed vision platform. It can both label multiple objects
in an image consisting of connected components and calculate their moment features and 16-bin hue-based
color histograms using cell-based labeling. We demonstrate the performance of our system by showing several
experimental results: (1) tracking of multiple color-patterned objects on a plate rotating at 16 rps, and (2)
tracking of human hand movement with two color-patterned drinking bottles.
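The features listed above are extracted by a dedicated hardware circuit in the actual system; the software sketch below only illustrates what those features are: centroid and orientation from image moments plus a 16-bin hue histogram per labeled object.

```python
# Software sketch of the per-object features the hardware module extracts:
# position and orientation from image moments plus a 16-bin hue histogram.
# The actual system computes these with a dedicated circuit on the vision platform.
import cv2
import numpy as np

def object_features(bgr_image, mask):
    """mask: uint8 binary image of one labeled object (non-zero inside)."""
    m = cv2.moments(mask, binaryImage=True)
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]               # centroid
    angle = 0.5 * np.arctan2(2 * m["mu11"], m["mu20"] - m["mu02"])  # orientation
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], mask, [16], [0, 180]).ravel()   # 16-bin hue histogram
    return (cx, cy), m["m00"], angle, hist / max(hist.sum(), 1)
```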
Image de-noising has been a well-studied problem in the field of digital image processing. However, a number of problems prevent state-of-the-art algorithms from finding their way into practical implementations. In our research we have addressed these issues with an implementation of a practical de-noising algorithm. In order of importance: firstly, we have designed a robust algorithm that tackles different kinds of noise over a very wide range of signal-to-noise ratios; secondly, we have tried to achieve natural-looking processed images and to avoid unnatural-looking artifacts; thirdly, we have designed the algorithm to be suitable for implementation on commercial-grade FPGAs capable of processing full HD (1920×1080) video data in real time (60 frames per second).
The main challenge for the use of noise reduction algorithms in photo and video applications is the compromise between the efficiency of the algorithm (amount of PSNR improvement), loss of detail, appearance of artifacts and the complexity of the algorithm (and consequently the cost of integration). In photo and video applications it is very important that the residual noise and artifacts produced by the noise reduction algorithm look natural and do not distract aesthetically. Our proposed algorithm does not produce the artificial-looking defects found in existing state-of-the-art algorithms.
In our research, we propose a robust and fast non-local de-noising algorithm. The algorithm is based on a Laplacian pyramid. The advantage of this approach is the ability to build noise reduction algorithms with a very large effective kernel. In our experiments, effective kernel sizes as large as 127×127 pixels were used in some cases, which required only 4 scales. This kernel size was required to perform noise reduction for images taken with a DSLR camera.
Taking into account the achievable improvement in PSNR (on the level of the best known noise reduction techniques) and the low algorithmic complexity, which enables its practical use in commercial photo and video applications, the results of our research can be very valuable.
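The proposed non-local algorithm is not reproduced here; the basic sketch below only illustrates the Laplacian pyramid idea, namely that a few scales of per-level coefficient shrinkage act like a very large effective kernel. Threshold values are illustrative.

```python
# Basic Laplacian-pyramid denoising sketch: decompose, softly shrink the detail
# coefficients at each scale, and reconstruct. This is only an illustration of
# the pyramid idea; the paper's algorithm uses a non-local scheme and is far
# more elaborate. Threshold values are illustrative.
import cv2
import numpy as np

def denoise_laplacian(img, levels=4, thresholds=(6, 4, 2, 1)):
    img = img.astype(np.float32)
    gauss, pyramid = img, []
    for t in thresholds[:levels]:
        down = cv2.pyrDown(gauss)
        lap = gauss - cv2.pyrUp(down, dstsize=gauss.shape[1::-1])
        lap = np.sign(lap) * np.maximum(np.abs(lap) - t, 0)   # soft shrinkage
        pyramid.append(lap)
        gauss = down
    out = gauss
    for lap in reversed(pyramid):
        out = cv2.pyrUp(out, dstsize=lap.shape[1::-1]) + lap
    return np.clip(out, 0, 255).astype(np.uint8)
```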
Hyperspectral image compression is an important task in remotely sensed Earth observation, as the dimensionality of this kind of image data is ever increasing. This requires on-board compression in order to optimize the downlink connection when sending the data to Earth. A successful algorithm for the lossy compression of remotely sensed hyperspectral data is the iterative error analysis (IEA) algorithm, which applies an iterative process that allows controlling the amount of information loss and the compression ratio depending on the number of iterations. This algorithm, which is based on spectral unmixing concepts, can be computationally expensive for hyperspectral images with high dimensionality. In this paper, we develop a new parallel implementation of the IEA algorithm for hyperspectral image compression on graphics processing units (GPUs). The proposed implementation is tested on several different GPUs from NVIDIA and is shown to exhibit real-time performance in the analysis of Airborne Visible Infra-Red Imaging Spectrometer (AVIRIS) data sets collected over different locations. The proposed algorithm and its parallel GPU implementation represent a significant advance towards real-time onboard (lossy) compression of hyperspectral data in which the quality of the compression can also be adjusted in real time.
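A compact NumPy sketch of the iterative error analysis idea, as commonly described (greedy selection of the worst-reconstructed pixel as a new endmember, followed by least-squares unmixing), is shown below; it is an illustration only and not the parallel GPU implementation developed in the paper.

```python
# Compact sketch of the iterative error analysis (IEA) idea: repeatedly pick the
# pixel with the largest reconstruction error as a new endmember, unmix all pixels
# against the current endmember set, and stop after a fixed number of iterations.
# This is a plain NumPy illustration, not the parallel GPU implementation.
import numpy as np

def iea_compress(pixels, n_endmembers=10):
    """pixels: (num_pixels, num_bands) hyperspectral data as float."""
    mean = pixels.mean(axis=0, keepdims=True)
    endmembers = []
    recon = np.repeat(mean, pixels.shape[0], axis=0)
    abundances = None
    for _ in range(n_endmembers):
        errors = np.linalg.norm(pixels - recon, axis=1)
        endmembers.append(pixels[np.argmax(errors)])          # worst-reconstructed pixel
        E = np.stack(endmembers, axis=1)                      # (bands, k)
        abundances, *_ = np.linalg.lstsq(E, pixels.T, rcond=None)
        recon = (E @ abundances).T
    return np.stack(endmembers), abundances.T                 # compressed representation
```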
Jose Augusto Stuchi, Elisa Signoreto Barbarini, Flavio Pascoal Vieira, Daniel dos Santos Jr., Mário Antonio Stefani, Fatima Maria Mitsue Yasuoka, Jarbas C. Castro Neto, Evandro Luis Linhari Rodrigues
The need for methods and tools that assist in determining the performance of optical systems is currently increasing. One of the most widely used methods for analysing optical systems is to measure the Modulation Transfer Function (MTF). The MTF represents a direct and quantitative verification of image quality. This paper presents the implementation of software to calculate the MTF of electro-optical systems. The software was used to calculate the MTF of a digital fundus camera, a thermal imager and an ophthalmologic surgery microscope. The MTF information aids the analysis of alignment and the measurement of optical quality, and also defines the limiting resolution of optical systems. The results obtained with the fundus camera and the thermal imager were compared with theoretical values. For the microscope, the results were compared with the MTF measured for a Zeiss microscope, which is the quality standard for ophthalmological microscopes.
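The paper does not detail its exact computation; the sketch below shows one common procedure, assumed here only for illustration: differentiate a measured edge profile to obtain the line spread function and take the magnitude of its Fourier transform as the MTF.

```python
# One common way to compute an MTF, sketched for illustration: differentiate a
# measured edge profile to get the line spread function (LSF) and take the
# magnitude of its Fourier transform. The paper does not detail its exact method,
# so this is only an assumption of a typical procedure.
import numpy as np

def mtf_from_edge(edge_profile, pixel_pitch_mm):
    lsf = np.diff(edge_profile.astype(float))            # line spread function
    lsf *= np.hanning(lsf.size)                           # reduce truncation artifacts
    mtf = np.abs(np.fft.rfft(lsf))
    mtf /= mtf[0]                                         # normalize to 1 at zero frequency
    freqs = np.fft.rfftfreq(lsf.size, d=pixel_pitch_mm)   # cycles per mm
    return freqs, mtf
```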
Polymerisation-induced shrinkage is one of the main reasons why photopolymer materials are not more widely used for holographic applications. The aim of this study is to evaluate the shrinkage in an acrylamide photopolymer layer during holographic recording using holographic interferometry. Shrinkage in photopolymer layers can be measured by real-time capture of holographic interferograms during holographic recording. Interferograms were captured using a CMOS camera at regular intervals. The optical path length change, and hence the shrinkage, were determined from the captured fringe patterns. It was observed that the photopolymer layer shrinkage is on the order of 3.5%.
In this study, we develop a real-time, structured light 3D scanner that can output 3D video of 512×512 pixels at
500 fps using a GPU-based, high-speed vision system synchronized with a high-speed DLP projector. Our 3D
scanner projects eight pairs of positive and negative image patterns with 8-bit gray code on the measurement
objects at 1000 fps. Synchronized with the high-speed vision platform, these images are simultaneously captured
at 1000 fps and processed in real time for 3D image generation at 500 fps by introducing parallel pixel processing
on a NVIDIA Tesla 1060 GPU board. Several experiments are performed for high-speed 3D objects that undergo
sudden 3D shape deformation.
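A sketch of the Gray code decoding step is given below: each positive/negative pattern pair yields one bit per pixel, the bits are assembled into an 8-bit Gray code, and the code is converted to a binary projector coordinate. The GPU parallelization and the triangulation into 3D points are omitted.

```python
# Sketch of Gray-code decoding for structured light: compare each positive/negative
# pattern pair to get one bit per pixel, assemble the 8-bit Gray code, and convert
# it to a binary projector coordinate. Triangulation into 3D points is omitted.
import numpy as np

def decode_gray(pos_images, neg_images):
    """pos_images, neg_images: lists of 8 grayscale frames (MSB first)."""
    bits = [(p.astype(int) > n.astype(int)).astype(np.uint16)
            for p, n in zip(pos_images, neg_images)]
    gray = np.zeros_like(bits[0])
    for b in bits:
        gray = (gray << 1) | b
    binary = gray.copy()                 # Gray-to-binary conversion
    shift = gray >> 1
    while shift.any():
        binary ^= shift
        shift >>= 1
    return binary                        # projector column index per pixel
```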
In this paper we discuss certain recently developed invariant geometric techniques that can be used for fast
object recognition or fast image understanding. The results make use of techniques from algebraic geometry
that allow one to relate the geometric invariants of a feature set in 3D to similar invariants in 2D or 1D. The
methods apply equally well to optical images or radar images. In addition to the "object/image" equations
relating these invariants, we also discuss certain invariant metrics and show why they provide a more natural
and robust test for matching object features to image features. Additional aspects of the work as it applies to
shape reconstruction and shape statistics will also be explored.
A Wireless Visual Sensor Network (WVSN) is an emerging field that combines an image sensor, an on-board computation unit, a communication component and an energy source. Compared to a traditional wireless sensor network, which operates on one-dimensional data such as temperature or pressure values, a WVSN operates on two-dimensional data (images), which requires higher processing power and communication bandwidth. Normally, WVSNs are deployed in areas where the installation of wired solutions is not feasible. The energy budget in these networks is limited to batteries because of the wireless nature of the application. Due to the limited availability of energy, the processing at Visual Sensor Nodes (VSNs) and the communication from a VSN to the server should consume as little energy as possible. Transmitting raw images wirelessly consumes a lot of energy and requires high communication bandwidth. Data compression methods reduce the data efficiently and hence are effective in reducing the communication cost in a WVSN. In this paper, we compare the compression efficiency and complexity of six well-known bi-level image compression methods. The focus is to determine the compression algorithms that can efficiently compress bi-level images and whose computational complexity is suitable for the computational platforms used in WVSNs. These results can be used as a road map for the selection of compression methods for different sets of constraints in a WVSN.
In this paper, we present a novel algorithm for motion detection in video sequences. The proposed algorithm is based on the use of the median of the absolute deviations from the median (MAD) as a measure of the statistical dispersion of pixels in a video sequence, providing the robustness needed to detect motion in a frame of a video sequence. By using the MAD, the proposed algorithm is able to detect small or large objects; the size of the detected objects depends on the size of the kernel used in the analysis of the video sequence. Experimental results in human motion detection are presented, showing that the proposed algorithm can be used in security applications.
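A minimal per-pixel temporal version of the MAD statistic is sketched below for illustration; the kernel and threshold choices of the actual algorithm are not specified here, so the values used are placeholders.

```python
# Simple per-pixel temporal MAD sketch: over a short window of frames, flag pixels
# whose current deviation from the temporal median exceeds a multiple of the MAD.
# Window length and threshold are illustrative placeholders.
import numpy as np

def mad_motion_mask(frames, k=3.0):
    """frames: (T, H, W) grayscale stack; returns boolean motion mask for the last frame."""
    stack = frames.astype(float)
    med = np.median(stack, axis=0)
    mad = np.median(np.abs(stack - med), axis=0) + 1e-6   # robust dispersion per pixel
    deviation = np.abs(stack[-1] - med)
    return deviation > k * mad
```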
To address the computational complexity of the wavelet packet transform of a moving window with a large amount of
overlap between consecutive windows, a recursive computation approach was introduced previously [1]. In this work,
this approach is extended to 2D, i.e. images. In addition, the FPGA implementation of the recursive approach for updating
wavelet coefficients is performed by using the LabVIEW FPGA module. This programming approach is graphical and
requires no knowledge of relatively involved hardware description languages. A number of optimization steps including
both filter and wavelet stage pipelining are taken in order to achieve a real-time throughput. It is shown that the recursive
approach reduces the computational complexity significantly as compared to the non-recursive or the classical
computation of wavelet packet transform. For example, the number of multiplications is reduced by a factor of 3 for a
3-stage 1D transform of moving windows containing 128 samples and by a factor of 12 for a 3-stage 2D transform of
moving window blocks of size 16×16 with 50% overlap.
When natural disasters or other large-scale incidents occur, obtaining accurate and timely information on the developing
situation is vital to effective disaster recovery operations.
High-quality video streams and high-resolution images, if
available in real time, would provide an invaluable source of current situation reports to the incident management team.
Meanwhile, a disaster often causes significant damage to the communications infrastructure. Therefore, another essential
requirement for disaster management is the ability to rapidly deploy a flexible incident area communication network.
Such a network would facilitate the transmission of real-time video streams and still images from the disrupted area to
remote command and control locations.
In this paper, a comprehensive end-to-end video/image transmission system between an incident area and a remote
control centre is proposed and implemented, and its performance is experimentally investigated. In this study a hybrid
multi-segment communication network is designed that seamlessly integrates terrestrial wireless mesh networks
(WMNs), distributed wireless visual sensor networks, an airborne platform with video camera balloons, and a Digital
Video Broadcasting-Satellite (DVB-S) system.
By carefully integrating all of these rapidly deployable, interworking and collaborative networking technologies, we can
fully exploit the joint benefits provided by WMNs, WSNs, balloon camera networks and DVB-S for real-time video
streaming and image delivery in emergency situations among the disaster hit area, the remote control centre and the
rescue teams in the field. The whole proposed system is implemented in a proven simulator. Through extensive
simulations, the real-time visual communication performance of this integrated system has been numerically evaluated,
providing a more in-depth understanding of how to support high-quality visual communications in such a demanding context.
Recently, the concept of Mobile Cloud Computing (MCC) has been proposed to offload the resource requirements in
computational capabilities, storage and security from mobile devices into the cloud. Internet video applications such as
real-time streaming are expected to be ubiquitously deployed and supported over the cloud for mobile users, who
typically encounter a range of wireless networks of diverse radio access technologies during their roaming. However,
real-time video streaming for mobile cloud users across heterogeneous wireless networks presents multiple challenges.
The network-layer quality of service (QoS) provision to support high-quality mobile video delivery in this demanding
scenario remains an open research question, and this in turn affects the application-level visual quality and impedes
mobile users' perceived quality of experience (QoE).
In this paper, we devise a framework to support real-time video streaming in this new mobile video networking paradigm
and evaluate the performance of the proposed framework empirically through a lab-based yet realistic testing platform.
One particular issue we focus on is the effect of users' mobility on the QoS of video streaming over the cloud. We design
and implement a hybrid platform comprising a test-bed and an emulator, on which our concepts of mobile cloud computing, video streaming and heterogeneous wireless networks are implemented and integrated to allow the testing of our framework. As representative heterogeneous wireless networks, the popular WLAN (Wi-Fi) and MAN (WiMAX) networks are incorporated in order to evaluate the effects of handovers between these different radio access technologies. The H.264/AVC (Advanced Video Coding) standard is employed for real-time video streaming from a server to mobile users (client nodes) in the networks. Mobility support is introduced to enable a continuous streaming experience for a mobile user across the heterogeneous wireless network. Real-time video stream packets are captured for analytical
purposes on the mobile user node. Experimental results are obtained and analysed. Future work is identified towards
further improvement of the current design and implementation.
With this new mobile video networking concept and paradigm implemented and evaluated, results and observations
obtained from this study would form the basis of a more in-depth, comprehensive understanding of various challenges
and opportunities in supporting high-quality real-time video streaming in mobile cloud over heterogeneous wireless
networks.
A Visual Sensor Network (VSN) is a network of spatially distributed cameras. The primary difference between a VSN and other types of sensor networks is the nature and volume of the information. A VSN generally consists of cameras, communication, storage and a central computer, where image data from multiple cameras is processed and fused. In this paper, we use optimization techniques to reduce the cost, as derived from a model of a VSN, to track large birds, such as the golden eagle, in the sky. The core idea is to divide a given monitoring range of altitudes into a number of sub-ranges of altitudes. The sub-ranges of altitudes are monitored by individual VSNs: VSN1 monitors the lowest range, VSN2 the next higher one, and so on, such that a given area is monitored at minimum cost. The VSNs may use similar or different types of cameras but different optical components, thus forming a heterogeneous network. We have calculated the cost required to cover a given area by considering the altitude range as a single element and also by dividing it into sub-ranges. Covering a given area and altitude range with a single VSN requires 694 camera nodes, whereas dividing this range into sub-ranges of altitudes requires only 88 nodes, an 87% reduction in cost.
3D video content is captured and created mainly in high resolution targeting big cinema or home TV screens. For 3D
mobile devices, equipped with small-size auto-stereoscopic displays, such content has to be properly repurposed,
preferably in real-time. The repurposing requires not only spatial resizing but also properly maintaining the output stereo
disparity, as it should deliver realistic, pleasant and harmless 3D perception.
In this paper, we propose an approach to adapt the disparity range of the source video to the comfort disparity zone of
the target display. To achieve this, we adapt the scale and the aspect ratio of the source video. We aim at maximizing the
disparity range of the retargeted content within the comfort zone, and minimizing the letterboxing of the cropped
content.
The proposed algorithm consists of five stages. First, we analyse the display profile, which characterises what 3D content can be comfortably observed on the target display. Then, we perform a fast disparity analysis of the input stereoscopic content. Instead of returning a dense disparity map, it returns an estimate of the disparity statistics (min, max, mean and variance) per frame. Additionally, we detect scene cuts, where sharp transitions in disparities occur. Based on the estimated input and the desired output disparity ranges, we derive the optimal cropping parameters and the scale of the cropping window that yield the targeted disparity range and minimize the area of cropped and letterboxed content. Once the rescaling and cropping parameters are known, we perform a resampling procedure using spline-based and perceptually optimized resampling (anti-aliasing) kernels, which also have a very efficient computational structure. Perceptual optimization is achieved by adjusting the cut-off frequency of the anti-aliasing filter to the throughput of the target display.
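The sketch below illustrates one simplified reading of the scale derivation: since scaling a stereo pair by a factor s scales its screen disparities by s as well, the largest scale that keeps the content inside the display's comfort disparity zone follows from the ratio of the two ranges. Cropping and letterboxing of the resized frames are not shown, and the function is a hypothetical simplification, not the authors' optimization.

```python
# Sketch of the disparity-driven scale selection: scaling a stereo pair by a factor s
# scales its screen disparities by s, so the largest scale that keeps the content
# inside the display's comfort zone maps the input disparity range onto the target
# range. Cropping and letterboxing of the resized frames are not shown.
def retarget_scale(d_min_in, d_max_in, d_min_target, d_max_target, max_scale=1.0):
    input_range = d_max_in - d_min_in
    target_range = d_max_target - d_min_target
    if input_range <= 0:
        return max_scale                       # flat scene: no disparity constraint
    return min(max_scale, target_range / input_range)

# Example: content with disparities in [-30, 50] px retargeted to a comfort zone of [-8, 12] px.
s = retarget_scale(-30, 50, -8, 12)
print(f"scale factor: {s:.3f}")                # 20/80 = 0.25
```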
In this paper we propose an innovative approach to tackle the problem of traffic sign detection using a computer vision algorithm while taking into account real-time operation constraints, establishing intelligent strategies to simplify the algorithm's complexity as much as possible and to speed up the process. Firstly, a set of candidates is generated by a color segmentation stage, followed by a region analysis strategy in which the spatial characteristics of previously detected objects are taken into account. Finally, temporal coherence is introduced by means of a tracking scheme, performed using a Kalman filter for each potential candidate. Taking time constraints into consideration, efficiency is achieved in two ways: on the one hand, a multi-resolution strategy is adopted for segmentation, where global operations are applied only to low-resolution images, increasing the resolution to the maximum only when a potential road sign is being tracked. On the other hand, we take advantage of the expected spacing between traffic signs. Namely, the tracking of objects of interest allows us to generate inhibition areas, i.e. areas where no new traffic signs are expected to appear because a traffic sign already exists in the neighborhood. The proposed solution has been tested with real sequences in both urban areas and highways, and proved to achieve higher computational efficiency, especially as a result of the multi-resolution approach.
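A sketch of per-candidate tracking with a constant-velocity Kalman filter is given below using OpenCV; the state is the sign centre and its velocity, and the predicted position can be used to define the inhibition areas mentioned above. Noise parameters are illustrative, not the values used in the paper.

```python
# Sketch of per-candidate tracking with a constant-velocity Kalman filter
# (state: x, y, vx, vy; measurement: detected sign centre). Noise parameters
# are illustrative, not the values used in the paper.
import cv2
import numpy as np

def make_tracker():
    kf = cv2.KalmanFilter(4, 2)
    kf.transitionMatrix = np.array([[1, 0, 1, 0],
                                    [0, 1, 0, 1],
                                    [0, 0, 1, 0],
                                    [0, 0, 0, 1]], np.float32)
    kf.measurementMatrix = np.array([[1, 0, 0, 0],
                                     [0, 1, 0, 0]], np.float32)
    kf.processNoiseCov = np.eye(4, dtype=np.float32) * 1e-2
    kf.measurementNoiseCov = np.eye(2, dtype=np.float32) * 1.0
    return kf

kf = make_tracker()
for cx, cy in [(120, 80), (123, 82), (None, None), (130, 87)]:   # detections (one missed)
    predicted = kf.predict()                  # predicted centre defines the inhibition area
    if cx is not None:
        kf.correct(np.array([[cx], [cy]], np.float32))
    print("predicted centre:", predicted[:2].ravel())
```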
We describe an approach that offers almost real-time image enhancement through turbulent and wavy media. The approach consists of a combination of optimization-based adaptive optics with digital multi-frame post-processing. Applications in astronomical and terrestrial imaging, where the image features are initially unresolved due to loss of contrast, blur, vibrations and image wander, are illustrated by experimental results. New software from Flexible Optical BV is presented.
This paper presents a machine vision system for real-time computation of distance and angle of a camera from reference
points in the environment. Image pre-processing, component labeling and feature extraction modules were modeled at
Register Transfer (RT) level and synthesized for implementation on field programmable gate arrays (FPGA). The
extracted image component features were sent from the hardware modules to a soft-core processor, MicroBlaze, for
computation of distance and angle. A CMOS imaging sensor operating at a clock frequency of 27 MHz was used in our experiments to produce a video stream at a rate of 75 frames per second. The image component labeling and feature extraction modules run in parallel with a total latency of 13 ms. The MicroBlaze was interfaced with the component labeling and feature extraction modules through a Fast Simplex Link (FSL). The latency for computing the distance and angle of the camera from the reference points was measured to be 2 ms on the MicroBlaze, running at a 100 MHz clock frequency. In this paper, we present the performance analysis, device utilization and power consumption of the designed system. The FPGA-based machine vision system that we propose has a high frame rate, low latency and much lower power consumption than commercially available smart camera solutions.
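The geometry behind the distance and angle computation can be illustrated with a simple pinhole-model sketch, assuming two reference points of known spacing; the actual MicroBlaze implementation is not shown and the numbers below are hypothetical.

```python
# Pinhole-model sketch of the distance/angle computation: given two reference
# points with known real-world spacing, their pixel separation yields the distance,
# and the offset of their midpoint from the image centre yields the viewing angle.
# This illustrates the geometry only; the paper's MicroBlaze code is not shown.
import math

def distance_and_angle(p1, p2, spacing_m, focal_px, image_width_px):
    pixel_sep = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    distance = focal_px * spacing_m / pixel_sep           # similar triangles
    mid_x = (p1[0] + p2[0]) / 2
    angle = math.atan((mid_x - image_width_px / 2) / focal_px)
    return distance, math.degrees(angle)

d, a = distance_and_angle((300, 240), (340, 240), spacing_m=0.5, focal_px=800, image_width_px=640)
print(f"distance ~ {d:.2f} m, angle ~ {a:.1f} deg")
```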