In this paper, we introduce dynamic and scalable Synthetic Aperture Radar (SAR) terrain classification based on the
Collective Network of Binary Classifiers (CNBC). The CNBC framework is primarily adapted to maximize the SAR
classification accuracy on dynamically varying databases where variations can occur at any time in terms of (new)
images, classes, features and users' relevance feedback. Whenever a "change" occurs, the CNBC dynamically and
"optimally" adapts itself to the change by means of its topology and the underlying evolutionary method MD PSO.
Thanks to its "Divide and Conquer" type approach, the CNBC can also support a varying and large set of (PolSAR)
features among which it optimally selects, weighs and fuses the most discriminative ones for a particular class. Each
SAR terrain class is discriminated by a dedicated Network of Binary Classifiers (NBC), which encapsulates a set of
evolutionary Binary Classifiers (BCs) discriminating the class with a distinctive feature set. Moreover, with each
incremental evolution session, new classes/features can be introduced which signals the CNBC to create new
corresponding NBCs and BCs within to adapt and scale dynamically to the change. This can in turn be a significant
advantage when the current CNBC is used to classify multiple SAR images with similar terrain classes since no or only
minimal (incremental) evolution sessions are needed to adapt it to a new classification problem while using the
previously acquired knowledge. We demonstrate our proposed classification approach over several medium and high-resolution
NASA/JPL AIRSAR images applying various polarimetric decompositions. We evaluate and compare the
computational complexity and classification accuracy against static Neural Network classifiers. While the CNBC classification
accuracy can compete with and even surpass theirs, its computational complexity is significantly lower, as the
CNBC body supports high parallelization, making it applicable to grid/cloud computing.
This paper presents a hypothesis that stereoscopic perception requires a short adjustment period after a scene change
before it is fully effective. A compression method based on this hypothesis is proposed - instead of coding pictures from
the left and right views conventionally, a view in the middle of the left and right view is coded for a limited period after a
scene change. The coded middle view can be utilized in two alternative ways in rendering. First, it can be rendered as
such, which causes an abrupt change from conventional monoscopic video to stereoscopic video. Second, the layered
depth video (LDV) coding scheme can be used to associate depth, background texture, and background depth to the
middle view, enabling view synthesis and gradual view disparity increase in rendering. Subjective experiments were
conducted to evaluate and validate the presented hypothesis and compare the two rendering methods. The results indicate
that when the maximum disparity between the left and right views was relatively small, the presented time-variable
camera separation method was imperceptible. A compression gain, the magnitude of which depended on the scene
duration, was achieved with half of the sequences having a suitable disparity for the presented coding method.
In this paper, we propose a novel algorithm for constructing an Unequal Error Protection (UEP) FEC code
targeted towards video streaming applications. A concatenation of a set of parallel outer block codes followed
by a packet interleaver and an inner block code is presented. The algorithm calculates on the fly the optimal
allocation of the code rates of the inner and outer codes. When applied to video streaming applications using
H.264, the discussed UEP framework achieves gains of up to 5 dB in video quality compared to equal error
protection (EEP) FEC at the same code rate.
Block effect is one of the most annoying artifacts in digital video processing and is especially visible in low-bitrate
applications, such as mobile video. To alleviate this problem, we propose an adaptive quantization method for inter
frames that can reduce the visible block effect in DCT-based video coding. In the proposed method, a set of quantization
matrices is constructed before processing the video data. The matrices are constructed by exploiting the temporal frequency
limitations of the human visual system. The method adapts to motion information and selects an appropriate
quantization matrix for each inter-coded block. Based on the experimental results, the proposed scheme can achieve
better subjective video quality than conventional flat quantization, especially in low-bitrate applications.
Moreover, it does not introduce extra computational cost in a software implementation. The method does not change the
standard bitstream syntax, so it can be directly applied to many DCT-based video codecs. A potential application is
in mobile phones and other digital devices with low-bitrate requirements.
Low complexity video coding schemes aim to provide video encoding services also for devices with restricted
computational power. A video coding process based on the three-dimensional discrete cosine transform (3D DCT)
can offer a low complexity video encoder by omitting the computationally demanding motion estimation operation.
In this coding scheme, an extended fast transform is used, instead of motion estimation, to decorrelate
the temporal dimension of the video data. Typically, the most complex part of the 3D DCT based coding process
is the three-dimensional transform. In this paper, we demonstrate methods that can be used in a lossy coding
process to reduce the number of one-dimensional transforms required to complete the full 3D DCT or its inverse
operation. Because unnecessary computations can be omitted, fewer operations are required to complete the
transform. Results include the obtained computational savings for standard video test sequences. The savings
are reported in terms of computational operations. Generally, the reduced number of computational operations
also implies longer battery lifetime for portable devices.
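The saving described in the abstract above can be illustrated with a minimal sketch. It assumes one plausible source of "unnecessary computations": after lossy quantization many coefficient vectors are entirely zero, and the inverse 1D DCT of a zero vector is zero, so that transform can be skipped. The function names (`dct_matrix`, `idct_axis_pruned`, `pruned_idct3`) and the cubic 8x8x8 block are illustrative assumptions, not the authors' method.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix D, so that X = D @ x and x = D.T @ X."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)
    return m

def idct_axis_pruned(cube, axis, d):
    """Inverse 1D DCT along one axis, skipping vectors that are all zero.

    Returns the transformed cube and the number of 1D transforms computed.
    """
    moved = np.moveaxis(cube, axis, -1)
    flat = moved.reshape(-1, moved.shape[-1])
    out = np.zeros_like(flat)
    active = np.any(flat != 0, axis=1)   # vectors that carry information
    out[active] = flat[active] @ d       # row-vector form of x = D.T @ X
    result = np.moveaxis(out.reshape(moved.shape), -1, axis)
    return result, int(active.sum())

def pruned_idct3(coeffs):
    """Separable inverse 3D DCT of a cubic block (assumed shape n x n x n)."""
    d = dct_matrix(coeffs.shape[0])
    data, computed = coeffs.astype(float), 0
    for axis in range(3):
        data, used = idct_axis_pruned(data, axis, d)
        computed += used
    return data, computed

# Hypothetical block: only a 2x2x2 low-frequency corner survives quantization.
rng = np.random.default_rng(0)
coeffs = np.zeros((8, 8, 8))
coeffs[:2, :2, :2] = rng.normal(size=(2, 2, 2))

recon, computed = pruned_idct3(coeffs)
d = dct_matrix(8)
reference = np.einsum("ai,bj,ck,abc->ijk", d, d, d, coeffs)  # unpruned inverse
print(computed, "of", 3 * 8 * 8, "one-dimensional transforms computed")
```

The pruned result matches the unpruned reference, while the first two passes touch only the vectors that intersect the nonzero corner.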
In this paper we propose a generic framework for efficient retrieval of audiovisual media based on its audio content. This framework is implemented in a client-server architecture where the client application is developed in Java to be platform independent whereas the server application is implemented for the PC platform. The client application adapts to the characteristics of the mobile device where it runs, such as screen size and commands. The entire framework is designed to take advantage of the high-level segmentation and classification of audio content to improve the speed and accuracy of audio-based media retrieval. Therefore, the primary objective of this framework is to provide an adaptive basis for performing efficient video retrieval operations based on the audio content and types (i.e. speech, music, fuzzy and silence). Experimental results confirm that such an audio-based video retrieval scheme can be used from mobile devices to search and retrieve video clips efficiently over wireless networks.
In this paper, we propose an image coding scheme using an adaptive resizing algorithm to obtain a more compact coefficient representation in the block-DCT domain. Standard coding systems, e.g. JPEG baseline, utilize the block-DCT transform to reduce spatial correlation and to represent the image information with a small number of visually significant transform coefficients. Because the neighboring coefficient blocks may include only a few low-frequency coefficients, we can use a downsizing operation to combine the information of two neighboring blocks into a single block.
Fast and elegant image resizing methods operating in the transform domain have been introduced previously. In this paper, we introduce a way to use these algorithms to reduce the number of coefficient blocks that need to be encoded. At the encoder, the downsizing operation should be performed delicately to gain compression efficiency. The information of neighboring blocks can be efficiently combined if the blocks do not contain significant high-frequency components and if the blocks share similar characteristics. Based on our experiments, the proposed method can offer from 0 to 4 dB PSNR gain for block-DCT based coding processes. Best performance can be expected for large images containing smooth homogeneous areas.
It is well known that the problem of addressing heterogeneous networks in multicast can be solved by simultaneous transmission of multiple bitstreams of different bitrates and by layered encoding. This paper analyzes the use of H.264/AVC video coding in simulcast and for layered encoding. The sub-sequence feature of H.264/AVC enables hierarchical temporal scalability, which allows disposal of reference pictures from a coded bitstream without affecting the decoding of the remaining stream. In this paper we extend the scope of the H.264/AVC sub-sequence coding technique to quality scalability. The resulting quality scalable coding technique is similar to conventional coarse-granularity quality scalability but fully compatible with the H.264/AVC standard. It is found that the proposed method reduces bitrate consumption in the core network by up to 20% compared to simulcast. However, the bitrate required for enhanced-quality reception of scalably coded bitstreams is considerably higher than that of non-scalable bitstreams.
This paper investigates the transmission of H.264/AVC video in the 3GPP Multimedia Broadcast/Multicast Streaming service (MBMS). Application-layer forward error correction (FEC) codes are used to combat transmission errors in the radio access network. In this FEC protection scheme, the media RTP stream is organized into source blocks spanning many RTP packets, over which FEC repair packets are generated. This paper proposes a novel method for unequal error protection that is applicable in MBMS. The method reduces the expected tune-in delay when a new user joins a broadcast. It is based on four steps. First, temporally scalable H.264/AVC streams are coded including reference and non-reference pictures or sub-sequences. Second, the constituent pictures of a group of pictures (GOP) are grouped according to their temporal scalability layer. Third, the interleaved packetization mode of RFC 3984 is used to transmit the groups in ascending order of relevance for decoding. As an example, the non-reference pictures of a GOP are sent earlier than the reference pictures of the GOP. Fourth, each group is considered a source block for FEC coding and the strength of the FEC is selected according to its importance. Simulations show that the proposed method improves the quality of the received video stream and decreases the expected tune-in delay.
This paper presents a region-based segmentation method that automatically extracts moving objects from video sequences. Non-moving objects can also be segmented by using a graphical user interface. The segmentation scheme is inspired by existing methods based on the watershed algorithm. The over-segmented regions resulting from the watershed are first organized in a binary partition tree according to a similarity criterion. This tree determines the fusion order. Every region is then fused with its most similar neighbour according to a spatio-temporal criterion based on the region colors and their temporal continuity. The fusion can be stopped either by fixing a priori the final number of regions, or by markers given through the graphical user interface. Markers are also used to assign a class to non-moving objects. Classification of moving objects is obtained automatically by computing the Change Detection Mask. To improve the accuracy of the contours of the segmented objects, we apply a simple post-processing filter that refines the edges between different video object planes.
Content-based image retrieval has tremendous potential for exploration and utilization, both for researchers and for people in industry, due to its promising results. Expeditious retrieval of desired images requires indexing of the content in large-scale databases along with extraction of low-level features based on the content of the images contained in these databases. With the advancement of wireless communication technology and the availability of multimedia-capable phones, it has become vital to query image databases and retrieve results based on the content of the query. Our implemented system, “Mobile MUVIS”, based on contemporary MUVIS, aims to bring the capability of content-based query to any device supporting the Java platform. It consists of a light-weight client application running on a Java-enabled phone and a server containing a servlet running inside a web server. The server responds to an image query using efficient native code from the selected MUVIS database. The client application, running on the mobile phone, forms a query request, which is parsed by the servlet to find the closest match to the queried image. The query response is retrieved over a GPRS/HSCSD network and the images are displayed on the mobile phone. We are able to conclude that such a system is feasible but with limited results due to resource constraints on hand-held devices and the reduced network bandwidth available in mobile environments.
In this paper we present a novel approach to shape similarity estimation based on ordinal correlation. The proposed method operates in three steps: object alignment, contour to multilevel image transformation and similarity evaluation. This approach is suitable for use in CBIR, shape classification and performance evaluation of segmentation algorithms. The proposed technique produced encouraging results when applied to the MPEG-7 test data.
In this paper, we present a novel approach for describing and estimating similarity of shapes. The target application is content-based indexing and retrieval over large image databases. The shape feature vector is based on the efficient indexing of high curvature points (HCPs), which are detected at different levels of resolution of the wavelet transform modulus maxima decomposition. The scale information, together with other topological information of those high curvature points, is employed in a sophisticated similarity algorithm. The experimental results and comparisons show that the technique efficiently isolates similar shapes from a large database and adequately reflects human similarity perception. The proposed algorithm also proved efficient in matching heavily occluded contours with their originals and with other shape contours in the database containing similar portions.
In this paper we present a technique for shape similarity estimation for content-based indexing and retrieval over large image databases. Here the high curvature points are detected using wavelet decomposition. The feature set is extracted under the framework of polygonal approximation. It uses simple features extracted at high curvature points. The experimental result and comparisons show the performance of the proposed technique. This technique is also suitable to be extended to the retrieval of 3D objects.
A new class of nonlinear filters, called Vector Median Rational Hybrid Filters (VMRHF), for multispectral image processing was introduced and applied to the color image filtering problem. These filters are based on Rational Functions (RF). There are several advantages to the use of this function. First, it is a universal approximator and a good extrapolator. Second, it can be trained by a linear adaptive algorithm. Third, it has a best approximation for a specified function. The output is the result of a vector rational operation taking into account three sub-functions: two vector median (VM) sub-filters and one center weighted vector median filter (CWVMF). It was shown that every sub-function will preserve details within its sub-window. These filters exhibit desirable properties, such as edge and detail preservation and accurate chromaticity estimation. The performance of the proposed filter is compared against widely known nonlinear filters for multispectral image processing: vector median filters (VMF) introduced by Astola et al., which are derived as maximum likelihood (ML) estimates from exponential distributions, and the class of directional-distance filters (DDF), introduced to study the processing of color image data using directional information. Experimental and comparative results in color image filtering show very good performance measures when the error is measured in the L*a*b* space. L*a*b* is known as a space where equal color differences result in equal distances and, therefore, it is close to the human perception of colors.
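As background for the vector median sub-filters mentioned above, the following is a minimal sketch of the basic vector median operation only: the output is the window sample whose summed distance to all other samples is smallest. It is not the VMRHF itself. The L2 norm, the function name `vector_median`, and the sample window are assumptions made for illustration.

```python
import numpy as np

def vector_median(vectors):
    """Return the input vector with the smallest summed L2 distance to the others."""
    v = np.asarray(vectors, dtype=float)
    diffs = v[:, None, :] - v[None, :, :]            # all pairwise differences
    dist_sums = np.sqrt((diffs ** 2).sum(axis=2)).sum(axis=1)
    return v[np.argmin(dist_sums)]

# Hypothetical window of RGB pixels containing one impulsive outlier.
window = [[10, 10, 10], [12, 11, 10], [11, 10, 12], [250, 0, 0], [10, 12, 11]]
result = vector_median(window)
print(result)
```

Because the output is always one of the input vectors, no new colors are introduced, which is one reason vector processing is preferred over component-wise medians for color images.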
Until recently, collections of digital images were stored in classical databases and indexed by keywords entered by a human operator. This is no longer practical, due to the growing size of these collections. Moreover, the keywords associated with an image are either selected from a fixed set of words and thus cannot cover the content of all images, or they are the operators' personal description of each image and, therefore, are subjective. That is why systems for indexing images based on their content are needed. In this context, we propose in this paper a new system, MUVIS*, for content-based indexing and retrieval for image database management systems. MUVIS* indexes by keywords, and also allows indexing of objects and images based on color, texture, shape and objects' layout inside them. Due to the use of large vector features, pyramid trees are used for creating the index structure. The block diagram of the system is presented and the functionality of each block is explained. The features used are presented as well.
Rational filters are extended to multichannel signal processing and applied to the image interpolation problem. The proposed nonlinear interpolator exhibits desirable properties, such as edge and detail preservation. In this approach the pixels of the color image are considered as 3-component vectors in the color space. Therefore, the inherent correlation which exists between the different color components is not ignored, leading to better image quality than that obtained by component-wise processing. Simulations show that the resulting edges obtained using vector rational filters (VRF) are free from blockiness and jaggedness, which are usually present in images interpolated using linear techniques in particular, but also some nonlinear techniques, e.g. vector median hybrid filters (VFMH).
Watershed transformation is a tool for image segmentation widely used in computer vision applications. However, the complexity of processing large images calls for fast parallel algorithms. In this paper, an improved SPMD (single program multiple data) watershed algorithm based on image integration and sequential scannings is presented. The task performed by the algorithm is an alternative to the classical simulated immersion for computing the watershed image. Although the technique converges slowly on a single-processor computer, due to the repeated raster and anti-raster scannings of the image, it performs much faster in parallel, when subimages of the global image are processed simultaneously. Additionally, a global connected components operation employed for the parallel labeling of the seeds for region growing increases the efficiency of the algorithm. Results of a message passing interface (MPI) implementation tested on a Cray T3D parallel computer demonstrate the resilience of the presented parallel design solution to an increasing number of processors, as well as to larger image sizes.
Normalized cross-correlation is a standard way of looking for examples of a target in an image. A way to avoid scanning the entire image is explored. It exploits the observation that in the locality of a target object, the watershed of a smoothed image is similar to the watershed of a smoothed image of the target. Therefore, it is only necessary to align the target watershed with the image watershed and track the one along the other. This can be faster than scanning the entire image.
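For reference, the exhaustive baseline that the abstract above seeks to avoid can be sketched as follows. This is a plain full-image scan with zero-mean normalized cross-correlation, not the watershed-tracking method; the function names `ncc` and `best_match` and the random test image are illustrative assumptions.

```python
import numpy as np

def ncc(patch, template):
    """Zero-mean normalized cross-correlation of two equally sized arrays."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
    return 0.0 if denom == 0 else float((p * t).sum() / denom)

def best_match(image, template):
    """Scan every position and return (best score, (row, col))."""
    th, tw = template.shape
    best = (-2.0, (0, 0))
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            score = ncc(image[r:r + th, c:c + tw], template)
            if score > best[0]:
                best = (score, (r, c))
    return best

# Hypothetical test: cut the target out of a random image and find it again.
rng = np.random.default_rng(1)
image = rng.random((32, 32))
template = image[5:13, 7:15].copy()
score, position = best_match(image, template)
print(score, position)
```

The nested loops evaluate the score at every offset, which is the cost a watershed-guided search would reduce by restricting the candidate positions.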
New criteria for shape preservation are presented. These criteria are applied in optimizing soft morphological filters. The filters are optimized by simulated annealing and genetic algorithms which are briefly reviewed. A situation where the given criteria give better results compared to the traditional MAE and MSE criteria is illustrated.
A method based on genetic algorithms for finding an optimal soft morphological filter for a specific situation is presented. The behavior of soft morphological filters is illustrated by empirical results. Some optimal parameters are also proposed for the MAE and MSE error criteria.
Given a set of cost coefficients, obtained from a "representative" training data set and some desired set, we have
previously shown that the optimal Boolean filter, based on a defined error criterion, is obtained by simple compare/assign
operations. If, on the other hand, the desired solution is a stack filter, three steps must be added to the above procedure.
Following the compare/assign step above, we check if the resulting solution is of the desired type. If not, we compute
the maximal positive Boolean function contained in the resulting Boolean function. Finally, we check if adding other
minterms to the positive Boolean function obtained in the previous step will improve the criterion value.
The first step requires a very low computational effort. For the following three steps, matrix based procedures using
the stacking matrix, are derived. First, we derive a fast procedure for checking the positivity of a Boolean function.
This procedure can be written in a single line using Matlab® language. The following step consists of finding the
maximal positive Boolean function embedded in a given Boolean function. Again, a fast procedure is derived for this
task, which can also be written in one line using Matlab® language. The final step checks for improvement, in the cost
criterion, when adding other minterms to the positive Boolean function, resulting from the previous step. We will use
again the stacking matrix to accomplish this task, resulting in a three-line Matlab® code. Some examples are provided
to illustrate each step in the above procedure.
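The positivity check and the maximal positive function described above can be sketched compactly with matrix operations. The sketch is in Python with NumPy rather than Matlab, and it assumes that the stacking matrix encodes the componentwise ordering of binary input vectors (entry (i, j) is true when vector i is less than or equal to vector j); the paper's exact definition may differ. The function names are hypothetical, and the final step (adding minterms to improve the cost) is not covered.

```python
import numpy as np

def stacking_matrix(n):
    """S[i, j] is True when binary vector i <= binary vector j componentwise.

    Vectors are identified with the integers 0 .. 2**n - 1 (assumed encoding).
    """
    idx = np.arange(2 ** n)
    return (idx[:, None] & idx[None, :]) == idx[:, None]

def is_positive(f, s):
    """A Boolean function is positive iff f[i] <= f[j] whenever i <= j."""
    return not np.any(s & (f[:, None] > f[None, :]))

def maximal_positive(f, s):
    """Largest positive function contained in f: g[i] = 1 iff f[j] = 1 for all j >= i."""
    return np.all(~s | (f[None, :] == 1), axis=1).astype(int)

# Hypothetical truth table on 3 variables, indexed by the integer value of the input.
s = stacking_matrix(3)
f = np.array([0, 1, 0, 0, 0, 1, 1, 1])
g = maximal_positive(f, s)
print(is_positive(f, s), g, is_positive(g, s))
```

Here `f` fails the check because it maps 001 to 1 but 011 to 0, and `g` keeps only the minterms of `f` whose entire up-set lies inside `f`.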
There is a finite number of different weighted order statistic (WOS) filters of a fixed length N. However, even for relatively small values of N, one cannot immediately see if two given WOS filters are the same by simply looking at the weights and the thresholds. This problem is addressed in this paper. We define two WOS filters to be equivalent (the same) if they produce the same output for arbitrary inputs. We show that the solution requires the use of integer linear programming and then develop a hierarchical heuristic procedure which may provide a much quicker solution to the given problem. The hierarchy starts with simple checks and proceeds to more and more complicated tests. The procedure is exited as soon as a definite conclusion is reached.
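For small N, the equivalence defined above can be checked by brute force, which makes the problem statement concrete. The sketch below is neither the integer linear programming formulation nor the hierarchical procedure of the paper. It assumes non-negative weights and the convention that, on binary inputs, a WOS filter outputs 1 exactly when the summed weight of the inputs equal to 1 reaches the threshold; by threshold decomposition, two such filters agree on all inputs when these Boolean functions agree.

```python
from itertools import product

def wos_boolean(weights, threshold):
    """Truth table of the WOS filter's Boolean function over all binary inputs."""
    return tuple(
        int(sum(w * x for w, x in zip(weights, bits)) >= threshold)
        for bits in product((0, 1), repeat=len(weights))
    )

def equivalent(filter_a, filter_b):
    """Compare two (weights, threshold) pairs by exhaustive enumeration."""
    return wos_boolean(*filter_a) == wos_boolean(*filter_b)

# Hypothetical examples for N = 3.
median = ((1, 1, 1), 2)
scaled_median = ((2, 2, 2), 3)
maximum = ((1, 1, 1), 1)
identity_like = ((3, 1, 1), 3)

print(equivalent(median, scaled_median))           # same filter, different parameters
print(equivalent(median, maximum))                 # different filters
print(equivalent(identity_like, ((1, 0, 0), 1)))   # both pass the first sample
```

The enumeration costs 2^N evaluations per filter, which is why a cheaper hierarchy of tests is attractive for larger windows.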
In this paper, we propose to use a class of nonlinear filters, called weighted median filters, as preprocessors to JPEG-based image coding. A theory to design weighted median filters which maximally smooth the image subject to a set of important image details is presented. An efficient algorithm is provided and simulations are given to demonstrate the performance of these preprocessors.
Watershed transformation is used in morphological image segmentation. This transformation could be considered as a topographic region growing method. Recently, fast watershed algorithms have been proposed for general purpose computers. They are based on immersion simulations of the image surface, which is considered as a topographic relief. In such a model, the greylevel values of pixels stand for altitude values on the relief. In this paper, the operation of the present fast watershed algorithms is analyzed and a new extension is proposed. Drawbacks of the present algorithms are pointed out, studied, and illustrated with test images. These problems lead, in several cases, to a loss of information about image details and structures or even to unprocessed areas in the image. The new watershed algorithm overcomes these deficiencies and preserves more information about image details. The new algorithm is based on a split-and-merge scheme. It constantly monitors the presence of isolated areas during the immersion simulation, considering them as new catchment basins. Application of the split-and-merge watershed algorithm to marker-based image segmentation is discussed.
Multistage data-dependent center-weighted-median (CWM) filters, which contain a cascade of several adaptive CWM filters based on local statistics, are presented in this paper for image restoration. Window shapes on different stages, which are oriented in different important correlation directions of images, can be different. The selection of a window shape on each stage provides additional flexibility for the design of filters in order to improve the filtering performance. One important merit of the method is that it removes noise in all regions of images, including detail regions, while still preserving the details. The performance of the proposed filters is better than that of the corresponding 2D single-stage filters with the same window size in many cases. Computer simulations are provided to assess the performance of the proposed filters.
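The building block of the filters above, a single CWM stage, can be sketched as follows. This is a fixed-weight, single-stage filter on a 3x3 square window, so it does not include the data-dependent weight adaptation, the oriented window shapes, or the cascade. The function name `cwm_filter`, the edge padding, and the odd `center_weight` parameter are assumptions made for illustration.

```python
import numpy as np

def cwm_filter(image, center_weight=3):
    """3x3 center-weighted median: the center sample is counted center_weight times.

    center_weight is assumed odd so that the median is always an input sample.
    """
    padded = np.pad(image, 1, mode="edge")
    out = np.empty_like(image, dtype=float)
    for r in range(image.shape[0]):
        for c in range(image.shape[1]):
            window = padded[r:r + 3, c:c + 3].ravel()
            extra = np.repeat(image[r, c], center_weight - 1)
            out[r, c] = np.median(np.concatenate([window, extra]))
    return out

# Hypothetical test image: flat background with a single impulse.
image = np.full((5, 5), 10.0)
image[2, 2] = 255.0
smoothed = cwm_filter(image, center_weight=3)
unchanged = cwm_filter(image, center_weight=9)
print(smoothed[2, 2], unchanged[2, 2])
```

A weight of 1 gives the plain median, while a weight of 9 in a 3x3 window makes the center sample a majority and returns the input unchanged, so the weight trades noise removal against detail preservation.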
This paper analyzes the properties of some layered structures formed by cascading layers of Boolean and stack filters and solves the optimal design problem using techniques developed under a training framework. We propose a multilayer filtering architecture, where each layer represents a Boolean or a stack filter and the outputs of the intermediate filtering layers provide some partial solutions for the optimization problem while the final solution is provided by the last layer output. The approach to the optimal design is based on a training framework. Simulations are provided to show the effectiveness of the proposed algorithms in image restoration applications.
In this paper, we extend the concept of the Mi's of weighted median filters to stack filters. A fast algorithm is proposed to compute Mi of stack filters. The problem of synthesis of stack filters by rank selection probabilities through the Mi's is addressed. The necessary and sufficient condition for Mi to be a stack filter is presented. A procedure is proposed to synthesize a stack filter by a given set of rank selection probabilities.
In this paper, we develop a theory to design weighted order statistic filters with structural approach and discuss their applications in image filtering. By introducing a set of parameters, called Mi's, the statistical properties of weighted order statistic filters are analyzed. A theorem is presented to show that any symmetric weighted order statistic filter will drive the input to a root or oscillate in a cycle of period 2. This result was proven to hold only for some weighted order statistic filters. A condition is provided to guarantee the convergence of weighted order statistic filters.
Design of optimal generalized stack filters (GSFs) under the mean absolute error (MAE) criterion suffers from two bottlenecks: the design procedure depends on the joint statistics of the signal and noise processes, which are rarely known, and it calls for a huge linear program (LP). In this paper we suggest efficient approaches to solve these problems. First, we present a method to estimate, based on training sequences, all the probabilities needed during the filter design procedure. Then, we introduce an algorithm that involves only data comparisons but results in optimal filters in most practical cases. Design examples for image restoration from impulsive noise are provided.
A new optimization theory for stack filters is presented in this paper. This new theory is based on the minimax error criterion rather than the mean absolute error (MAE) criterion used in .
In the binary case, a methodology will be designed to find the stack filter that minimizes the maximum absolute error between the input and the output signals. The most interesting feature of this optimization procedure is the fact that it can be solved using a linear program (LP), just like in the MAE case.
One drawback of this procedure is the problem of randomization due to the loss of structure in the constraint matrix of the LP. Several sub-optimal solutions will be discussed and an algorithm to find an optimal integer solution (still using an LP) under certain conditions will be provided.
When generalizing to multiple-level inputs, complexity problems will arise and two alternatives will be suggested.
One of these approaches assumes a parameterized stochastic model for the noise process and the LP is to pick the stack filter which minimizes the worst effect of the noise on the input signal.