A metaclassification ensemble approach is known to improve prediction performance for snow-covered area. The methodology adopted here is based on a neural network combined with four state-of-the-art machine learning algorithms (support vector machine, artificial neural network, spectral angle mapper, and K-means clustering) and a snow index, the normalized difference snow index. An AdaBoost ensemble algorithm based on decision trees is also proposed for snow-cover mapping. According to the available literature, these methods have rarely been used for snow-cover mapping. Employing these techniques, a study was conducted for the Raktavarn and Chaturangi Bamak glaciers, Uttarakhand, Himalaya, using a multispectral Landsat 7 ETM+ (enhanced thematic mapper plus) image. The study also compares the results with those obtained from statistical combination methods (majority rule and belief functions) and with the accuracies of the individual classifiers. Accuracy assessment is performed by computing the quantity and allocation disagreement and by analyzing statistical measures (accuracy, precision, specificity, AUC, and sensitivity) and receiver operating characteristic curves. A total of 225 combinations of parameters for the individual classifiers were trained and tested on the dataset, and the results were compared with the proposed approach. The proposed methodology produced the highest classification accuracy (95.21%), close to the 94.01% produced by the proposed AdaBoost ensemble algorithm. From these observations, it was concluded that the ensemble of classifiers produced better results than the individual classifiers.
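As an illustration of the accuracy assessment mentioned above, the following minimal sketch computes quantity and allocation disagreement from a confusion matrix, using the commonly cited formulation in which the two components sum to the total disagreement. The function name and the snow / non-snow counts are hypothetical and are not taken from the study.

```python
import numpy as np

def disagreement(confusion):
    """Return (quantity, allocation) disagreement for a square confusion matrix.

    Rows are taken as mapped classes and columns as reference classes
    (an assumption; swapping them leaves both values unchanged).
    """
    p = np.asarray(confusion, dtype=float)
    p = p / p.sum()                                  # counts -> proportions
    row, col, diag = p.sum(axis=1), p.sum(axis=0), np.diag(p)
    quantity = 0.5 * np.abs(row - col).sum()
    # Per class, allocation is 2 * min(commission, omission); halved over the sum.
    allocation = np.minimum(row - diag, col - diag).sum()
    return quantity, allocation

# Hypothetical snow / non-snow counts, for illustration only.
cm = [[45, 5],
      [10, 40]]
q, a = disagreement(cm)
overall_accuracy = np.trace(np.asarray(cm)) / np.sum(cm)
```

Under this formulation, `q + a` equals `1 - overall_accuracy`, which gives a quick consistency check on any confusion matrix.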
KEYWORDS: Field programmable gate arrays, Image processing, Video processing, Video surveillance, Intelligence systems, Image quality, Algorithm development, Digital signal processing
Weather degradation such as haze, fog, and mist severely reduces the effective range of visual surveillance. This degradation is a spatially varying phenomenon, which makes the problem nontrivial. Dehazing is an essential preprocessing stage in applications such as long-range imaging, border security, and intelligent transportation systems. However, these applications require the preprocessing block to have low latency. Although the conventional single-image dark channel prior algorithm yields impressive results, it is computationally expensive. In this work, the single-image dark channel prior algorithm is modified and implemented for fast processing with comparable visual quality of the restored image/video. Moreover, a two-stage image dehazing architecture is introduced, wherein the dark channel and airlight are estimated in the first stage, and the transmission map and intensity restoration are computed in the second stage. The algorithm is implemented using Xilinx Vivado software and validated on a Xilinx ZC702 development board, which contains an Artix-7 equivalent field programmable gate array (FPGA) and an ARM Cortex-A9 dual-core processor. Additionally, a high-definition multimedia interface (HDMI) has been incorporated for video feed and display purposes. The results show that the dehazing algorithm attains 29 frames per second at an image resolution of 1920x1080, which is suitable for real-time applications. The design utilizes 9 18K_BRAM, 97 DSP_48, 6508 FFs and 8159 LUTs.
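The sequence of steps named above (dark channel, airlight, transmission map, intensity restoration) can be sketched in software as follows. This is a minimal NumPy version of the conventional dark channel prior, not the modified FPGA design described in the abstract. The patch size, `omega`, `t0`, and the airlight rule (mean colour of the brightest 0.1% of dark-channel pixels) are assumptions chosen for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dark_channel(img, patch=15):
    """Minimum over colour channels, then over a patch x patch window (patch odd)."""
    min_rgb = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    windows = sliding_window_view(padded, (patch, patch))
    return windows.min(axis=(2, 3))

def dehaze(img, patch=15, omega=0.95, t0=0.1):
    """img: float array in [0, 1] with shape (H, W, 3)."""
    # Stage 1: dark channel and airlight.
    dark = dark_channel(img, patch)
    n = max(1, dark.size // 1000)
    idx = np.argsort(dark.ravel())[-n:]
    airlight = np.maximum(img.reshape(-1, 3)[idx].mean(axis=0), 1e-6)
    # Stage 2: transmission map and intensity restoration.
    transmission = 1.0 - omega * dark_channel(img / airlight, patch)
    t = np.maximum(transmission, t0)[..., None]   # floor avoids division blow-up
    restored = (img - airlight) / t + airlight
    return np.clip(restored, 0.0, 1.0), transmission

# Synthetic hazy frame (hypothetical data): scene * t + airlight * (1 - t).
rng = np.random.default_rng(0)
hazy = 0.6 * rng.random((32, 32, 3)) + 0.9 * 0.4
restored, t_map = dehaze(hazy, patch=7)
```

The window minimum is the costly step in software, which is consistent with the abstract's remark that the conventional algorithm is computationally expensive.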
Monitoring agricultural areas is still a very challenging task. Various models and methodologies have been developed for monitoring agricultural areas with satellite images, but their practical applicability is limited by processing complexity and dependence on a priori information. Therefore, this paper investigates the utility of the Kanade–Lucas–Tomasi (KLT) tracker, which is generally used for tracking objects in video images, for monitoring agricultural areas. The KLT tracker was proposed to deal with the problem of image registration, but its use on satellite images for land cover monitoring is rarely reported. Advanced Land Observing Satellite Phased Array type L-band Synthetic Aperture Radar (ALOS PALSAR) data have been used to identify and track the agricultural areas. The tracked pixels were compared with the agriculture pixels obtained from a decision tree algorithm, and the two results closely match. An image differencing change detection technique has been applied after KLT tracker implementation to observe the “change” and “no change” pixels in agricultural areas. Two kinds of changes are detected. Where agriculture was absent earlier but is now present, the changes are called positive changes. Where agriculture was present earlier but is now absent, the changes are called negative changes. Unchanged areas retrieved from both images are labeled as “no change” pixels. The novelty of the proposed algorithm is that it uses a simplified version of the KLT tracker to efficiently select and track the agriculture features on the basis of their spatial information and does not require a priori information every time.
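The positive / negative / no-change labeling described above can be illustrated with a short sketch. It assumes the tracked agriculture pixels from the two dates are already available as boolean masks of equal shape; the mask values and the function name are hypothetical.

```python
import numpy as np

def label_changes(before, after):
    """Difference two boolean agriculture masks.

    Returns +1 where agriculture appeared (positive change),
    -1 where it disappeared (negative change), and 0 for no change.
    """
    before = np.asarray(before, dtype=bool)
    after = np.asarray(after, dtype=bool)
    return after.astype(np.int8) - before.astype(np.int8)

# Hypothetical 2 x 2 masks for two acquisition dates.
before = np.array([[0, 1], [1, 0]], dtype=bool)
after = np.array([[1, 1], [0, 0]], dtype=bool)
changes = label_changes(before, after)
```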
In this paper, a multispectral active stereo vision system is developed for tracking the motion of moving objects
in different environments. The multispectral surveillance system is built by combining
visible (color) and infrared sensors. The aim behind proposing such a network of visible and thermal sensors
is to give optimal performance in various weather conditions (foggy, snowy, dark, and rainy) and to increase
detection performance. Moving objects are detected by means of the optical flow between the
images of two different spectra. The optical flow is computed by formulating an energy minimization
problem and subsequently solving it with a numerical optimization algorithm. The convergence of the numerical
scheme is shown for different images. Finally, a number of experimental results (optical flow from stereo
images) are given to demonstrate the applicability of such a system in various applications and situations where a robust
surveillance and tracking system is needed.
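The abstract does not state which energy functional or numerical scheme is used. As a generic illustration of optical flow obtained by energy minimization, the sketch below uses the classical Horn–Schunck formulation, solved by Jacobi-style iterations on two single-spectrum frames. The function name, `alpha`, the iteration count, and the synthetic frames are all assumptions.

```python
import numpy as np

def horn_schunck(im1, im2, alpha=1.0, n_iter=200):
    """Estimate flow (u, v) minimizing a brightness-constancy plus smoothness energy."""
    im1 = np.asarray(im1, dtype=float)
    im2 = np.asarray(im2, dtype=float)
    # np.gradient returns derivatives along rows (y) first, then columns (x).
    Iy, Ix = np.gradient(0.5 * (im1 + im2))
    It = im2 - im1
    u = np.zeros_like(im1)
    v = np.zeros_like(im1)

    def local_mean(f):
        # Average of the four direct neighbours, with replicated borders.
        p = np.pad(f, 1, mode="edge")
        return 0.25 * (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:])

    for _ in range(n_iter):
        u_bar, v_bar = local_mean(u), local_mean(v)
        common = (Ix * u_bar + Iy * v_bar + It) / (alpha ** 2 + Ix ** 2 + Iy ** 2)
        u = u_bar - Ix * common
        v = v_bar - Iy * common
    return u, v

# Synthetic test: a Gaussian blob shifted one pixel in +x between frames.
yy, xx = np.mgrid[0:41, 0:41]
blob = lambda cx: np.exp(-((xx - cx) ** 2 + (yy - 20) ** 2) / 50.0)
u, v = horn_schunck(blob(20), blob(21))
```

A larger `alpha` weights the smoothness term more heavily, giving a smoother flow field that underestimates the true displacement after a fixed number of iterations.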
In this paper, a new algorithm for object tracking is proposed using local extrema patterns (LEP) and color features. The standard local binary pattern (LBP) encodes the relationship between a reference pixel and its surrounding neighbors by comparing gray-level values. The proposed method differs from the existing LBP in that it extracts edge information based on local extrema between the center pixel and its neighbors in an image. Further, the joint histogram between the RGB color channels and the LEP patterns is built and used as the feature vector in object tracking. The performance of the proposed method is compared with that of Ning et al. on three benchmark video sequences. The results show that the proposed method significantly improves object tracking compared with Ning et al.
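For reference, the standard LBP that the abstract contrasts with can be sketched as below. This is the baseline 3 x 3 operator, not the proposed LEP, whose exact definition is not given here. The clockwise bit order and the `>=` comparison are common conventions and are assumed.

```python
import numpy as np

def lbp_3x3(gray):
    """8-neighbour LBP codes for the interior pixels of a 2-D gray image."""
    g = np.asarray(gray, dtype=np.int32)
    center = g[1:-1, 1:-1]
    h, w = center.shape
    # Neighbour offsets, clockwise from top-left (bit order is a convention).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    codes = np.zeros_like(center)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = g[dy:dy + h, dx:dx + w]
        # Set the bit when the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.int32) << bit
    return codes

flat = lbp_3x3(np.full((3, 3), 5))                    # all neighbours equal
peak = lbp_3x3([[1, 1, 1], [1, 9, 1], [1, 1, 1]])     # centre is a local maximum
hist = np.bincount(lbp_3x3(np.arange(25).reshape(5, 5)).ravel(), minlength=256)
```

A 256-bin histogram of these codes is the usual texture descriptor; the abstract's joint histogram with RGB channels would extend it with colour bins.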
In this paper, a novel support vector machine (SVM) tree is proposed for gesture recognition from the silhouette
images. A skeleton based strategy is adopted to extract the features from a video sequence representing any
human gesture. In our binary tree implementation of SVM, the number of binary classifiers required is reduced
since, instead of grouping different classes together in order to train a global classifier, we select two classes for
training at every node of the tree and use probability theory to classify the remaining points based on their
similarities and differences to the two classes used for training. This process is carried on, randomly selecting
two classes for training at a node, thus creating two child nodes and subsequently assigning the classes to the
nodes derived. In the classification phase, we start out at the root node. At each node of the tree, a binary
decision is made regarding the assignment of the input data point to either of the groups represented by the left
and right subtrees of the node, each of which may contain multiple classes. This is repeated recursively downward until
we reach a leaf node that represents the class to which the input data point belongs. Finally, the proposed
framework is tested on various data sets to check its efficiency. Encouraging results are achieved in terms of
classification accuracy.
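The classification phase described above, descending from the root to a leaf with one binary decision per node, can be sketched as follows. The `Node` structure, the threshold functions standing in for trained binary SVMs, and the gesture labels are hypothetical; training of the tree is not shown.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

@dataclass
class Node:
    label: Optional[str] = None                                  # set only on leaves
    decide: Optional[Callable[[Sequence[float]], bool]] = None   # True -> right subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def classify(node: Node, x: Sequence[float]) -> str:
    """Follow binary decisions downward until a leaf (single class) is reached."""
    while node.label is None:
        node = node.right if node.decide(x) else node.left
    return node.label

# Toy tree: simple thresholds stand in for the per-node binary classifiers.
tree = Node(
    decide=lambda x: x[0] > 0.0,
    left=Node(label="wave"),
    right=Node(
        decide=lambda x: x[1] > 0.0,
        left=Node(label="walk"),
        right=Node(label="jump"),
    ),
)
predicted = classify(tree, [1.0, -1.0])
```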
In this paper, a linear discriminant analysis (LDA) based classifier employed in a tree structure is presented to
recognize the human actions in a wide and complex environment. In particular, the proposed classifier is based
on a supervised learning process and achieves the required classification in a multi-step process. This multi-step
process is performed simply by adopting a tree structure which is built during the training phase. Hence, there
is no need for any a priori information, as in other classifiers, such as the number of hidden neurons or hidden
layers in a multilayer neural network based classifier or an exhaustive search as used in training algorithms
for decision trees. A skeleton based strategy is adopted to extract the features from a given video sequence
representing any human action. A Pan-Tilt-Zoom (PTZ) camera is used to monitor the wide and complex test
environment. A background mosaic image is built offline and used to compute the background images in real
time. A background subtraction strategy is adopted to detect the objects in the various frames and to
extract their corresponding silhouettes. A skeleton based process is used to extract attributes of a feature vector
corresponding to a human action. Finally, the proposed framework is tested on various indoor and outdoor
scenarios and encouraging results are achieved in terms of classification accuracy.
In this paper, a robust watermarking technique based on fractional cosine transform and singular value decomposition
is presented to improve the protection of images. A meaningful gray scale image is used as the watermark
instead of a randomly generated Gaussian-noise watermark. First, the host image is transformed by means
of the fractional cosine transform. Next, the positions of all frequency coefficients are changed according to a
secret rule known only to the owner/creator. Then the inverse fractional cosine transform is
performed to get the reference image. The watermark logo is embedded in the reference image by modifying its
singular values: the singular values of the reference image are computed and then modified by
adding the singular values of the watermark image. A reliable watermark extraction algorithm is developed for
extracting the watermark from a possibly attacked image. The experimental results show better visual imperceptibility
and resiliency of the proposed scheme against a variety of intentional or unintentional attacks.
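The singular-value embedding step described above can be sketched as follows. The sketch covers only that step: the fractional cosine transform and the secret coefficient permutation are omitted. The scaling factor `alpha`, the function names, the equal image sizes, and keeping the original singular values as side information for extraction are assumptions.

```python
import numpy as np

def embed(reference, watermark, alpha=0.05):
    """Add scaled watermark singular values to those of the reference image."""
    U, S, Vt = np.linalg.svd(reference, full_matrices=False)
    Sw = np.linalg.svd(watermark, compute_uv=False)
    marked = (U * (S + alpha * Sw)) @ Vt      # U @ diag(S + alpha*Sw) @ Vt
    return marked, S                          # S is kept by the owner

def extract_singular_values(marked, S, alpha=0.05):
    """Recover the watermark singular values from a (possibly attacked) image."""
    S_marked = np.linalg.svd(marked, compute_uv=False)
    return (S_marked - S) / alpha

# Hypothetical 8 x 8 "images" with random content.
rng = np.random.default_rng(1)
reference = rng.random((8, 8))
watermark = rng.random((8, 8))
marked, S = embed(reference, watermark)
recovered = extract_singular_values(marked, S)
```

Rebuilding the watermark image from `recovered` would also require the watermark's singular vectors; how the original scheme handles that is not stated in the abstract.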
KEYWORDS: Digital watermarking, Image compression, Image processing, Digital imaging, 3D image processing, 3D vision, Image transmission, Image filtering, Digital filtering, Information security
We present a robust stereo-image coding algorithm using digital watermarking in the fractional Fourier transform (FrFT) and singular value decomposition (SVD) domains. For security, the original (left stereo) image is degraded and the watermark (right disparity map) is embedded in the degraded image. This watermarked degraded stereo image is transmitted over an insecure channel. At the receiver's end, both the watermarked image (left stereo image) and the watermark image are recovered by the decoding process. The use of the FrFT, SVD, and the degradation of the stereo image makes it much more complex to decode the information about the stereo images and to extract the disparity map. Moreover, processing the watermarked image alone provides the stereo as well as the 3-D information of the scene/object. Experimental results show that the proposed algorithm achieves stereo image security efficiently.
In this paper, a multipurpose watermarking scheme is proposed. Here, multipurpose means that the proposed scheme can act as a single watermarking scheme (SWS) or a multiple watermarking scheme (MWS) according to requirement and convenience. We first segment the host image into blocks by means of the Hilbert space-filling curve and, based on the amount of DCT energy in the blocks, select the threshold values that make the proposed scheme multipurpose. For embedding n watermarks, (n - 1) thresholds are selected. If the amount of DCT energy of a block is less than the threshold value, then ENOPV decomposition is performed and the watermark is embedded in the low, the high, or all frequency sub-bands by modifying the singular values. If the amount of DCT energy of the block is greater than the threshold value, then embedding is done by modifying the singular values. This process of embedding through ENOPV-SVD and SVD is applied alternately to all (n - 1) threshold values. Finally, the modified blocks are mapped back to their original positions using the inverse Hilbert space-filling curve to get the watermarked image. A reliable extraction process is developed for extracting all watermarks from an attacked image. Experiments are conducted on different standard gray scale images, and robustness is evaluated against a variety of attacks.
KEYWORDS: Reflectivity, Monte Carlo methods, Edge detection, 3D image processing, Machine vision, Computer vision technology, 3D modeling, Image segmentation, Light sources, Visual process modeling
Many real-world objects, especially man-made ones, have a polyhedral shape. Shape
from shading (SFS) is a well-known and robust technique of computer vision. SFS is a first order
nonlinear, ill-posed problem. The main idea for solving ill-posed problems is to restrict the class of admissible
solutions by introducing suitable a priori knowledge. To overcome the ill-posedness in SFS techniques, Bayesian
estimation of geometrical constraints is used. The Lambertian reflectance model is used in this method due to
its wide applicability in SFS techniques. The priors, or constraints, are represented in the form of a probability
distribution function, so that the Bayesian approach can be applied. The Monte Carlo method is applied
for generating the sample fields from the distribution so that the model can represent our a priori knowledge and
constraints. The optimal estimators are also computed by using the Monte Carlo method. The geometric constraints
for lines and planes are used in a probabilistic manner to eliminate the rank deficiency and obtain a unique solution.
In case of incorrect line drawings, it is not always possible to reconstruct the object shape uniquely. To deal with
this problem, we have processed each planar face separately. Hence, the proposed method is applicable in case
of slight error in computation of vertex positions in the images of polyhedral objects. The proposed method is
used on various synthetic and real images and satisfactory results are obtained.
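The Lambertian reflectance model mentioned above relates image intensity to the angle between the surface normal and the light direction. A minimal sketch of that forward model follows; it does not cover the Bayesian or Monte Carlo estimation. The function name, the unit albedo, and the example normals are assumptions.

```python
import numpy as np

def lambertian_intensity(normals, light, albedo=1.0):
    """Intensity = albedo * max(0, n . s) for unit normals n and unit light direction s."""
    n = np.asarray(normals, dtype=float)
    n = n / np.linalg.norm(n, axis=-1, keepdims=True)
    s = np.asarray(light, dtype=float)
    s = s / np.linalg.norm(s)
    return albedo * np.clip(n @ s, 0.0, None)

# Three planar faces of a hypothetical polyhedron, lit from the viewing direction.
faces = [[0, 0, 1], [0, 1, 1], [1, 0, 0]]
intensity = lambertian_intensity(faces, light=[0, 0, 1])
```

Each planar face has a single normal and therefore a constant intensity under this model, which is why intensity alone underdetermines the shape and additional constraints are needed.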
For stereo imaging, it is general practice to use two cameras of the same focal length, with their viewing axes
normal to the line joining the camera centres. This paper analyses the effect of differences in the orientations and
focal lengths of two arbitrary perspective viewing cameras by deriving the epipolar lines and their corresponding
equations. This enables one to find the correspondence search space in terms of focal length accuracies as well
as camera orientation parameters. Relevant numerically simulated results are also given.
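As a hedged illustration of how epipolar lines depend on focal lengths and relative orientation, the sketch below builds the fundamental matrix for two pinhole cameras using the standard textbook relation F = K2^(-T) [t]x R K1^(-1). This is not the paper's own derivation, and the focal lengths, rotation, and translation are hypothetical values.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]x, so that skew(t) @ a equals np.cross(t, a)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

def intrinsics(f):
    """Pinhole intrinsics with focal length f and principal point at the origin."""
    return np.diag([f, f, 1.0])

def fundamental(K1, K2, R, t):
    """F with x2^T F x1 = 0 for cameras P1 = K1[I | 0] and P2 = K2[R | t]."""
    return np.linalg.inv(K2).T @ skew(t) @ R @ np.linalg.inv(K1)

def project(K, R, t, X):
    x = K @ (R @ X + t)
    return x / x[2]

# Hypothetical setup: different focal lengths, 10 degree rotation about the y axis.
th = np.deg2rad(10.0)
R = np.array([[np.cos(th), 0.0, np.sin(th)],
              [0.0, 1.0, 0.0],
              [-np.sin(th), 0.0, np.cos(th)]])
t = np.array([-1.0, 0.0, 0.1])
K1, K2 = intrinsics(800.0), intrinsics(1000.0)
F = fundamental(K1, K2, R, t)

X = np.array([0.3, -0.2, 5.0])                      # a 3-D point in camera-1 coordinates
x1 = project(K1, np.eye(3), np.zeros(3), X)
x2 = project(K2, R, t, X)
line2 = F @ x1                                      # epipolar line a*u + b*v + c = 0 in image 2
residual = float(x2 @ line2)                        # near zero when x2 lies on the line
```

Searching for the match of `x1` only along `line2` is what reduces the correspondence search from two dimensions to one.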
The shapes of many natural and man-made objects have planar and curvilinear surfaces. The images of such curves usually do not have sufficient distinctive features to apply conventional feature-based reconstruction algorithms. In this paper, we describe a method for reconstructing a quadratic curve in 3-D space as the intersection of two cones containing the respective projected curve images. The correspondence between this pair of projections of the curve is assumed to be established. Using least-squares curve fitting, the parameters of a curve in 2-D space are found. From these, the 3-D quadratic curve is reconstructed. Relevant mathematical formulations and analytical solutions for obtaining the equation of the reconstructed curve are given. The results of the described reconstruction methodology are evaluated through simulation studies. This reconstruction methodology is applicable to LBW decisions in cricket, missile path estimation, robotic vision, path planning, etc.
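The 2-D least-squares curve fitting step can be illustrated with an algebraic conic fit. The abstract does not specify the fitting formulation, so the unit-norm constraint solved through the SVD below is an assumption, as are the function name and the sample ellipse.

```python
import numpy as np

def fit_conic(x, y):
    """Least-squares (a, b, c, d, e, f) for a x^2 + b xy + c y^2 + d x + e y + f = 0.

    The coefficient vector is constrained to unit length; it is the right
    singular vector of the design matrix with the smallest singular value.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    D = np.column_stack([x ** 2, x * y, y ** 2, x, y, np.ones_like(x)])
    _, _, Vt = np.linalg.svd(D, full_matrices=False)
    return Vt[-1]

# Sample points on the ellipse x^2 / 4 + y^2 = 1 (hypothetical projected curve).
angles = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
coeffs = fit_conic(2.0 * np.cos(angles), np.sin(angles))
normalized = coeffs / coeffs[2]        # scale so the y^2 coefficient is 1
```

Each fitted image conic, together with its camera centre, defines a cone in space, and the abstract's reconstruction takes the intersection of the two cones.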
Reconstructing a parabola in 3-D space from a pair of arbitrary perspective views yields the set of parameters that represent the parabola. This method is widely used in many applications of 3-D object recognition, machine inspection, and trajectory tracing. However, in applications that require a high degree of accuracy, a study of errors in the reconstruction process, supported by a rigorous performance analysis, is necessary. In this paper, the reconstruction of a 3-D parabola from two perspective projections is described. In this process, the two end points and the vertex in each of the two projections of the parabola are considered as feature points to reconstruct the parabola in 3-D. Simulation studies have been conducted to observe the effect of noise on errors in the reconstruction process. The performance analysis, illustrating the effect of noise, loss of accuracy due to mathematical calculations, and parameters of the imaging setup on reconstruction errors, is presented. The angle between the reconstructed and original parabolas in 3-D space has been used as one of the criteria for measuring error. Lower image resolution, certain geometric conditions, and certain imaging setups produce poor reconstruction performance. The results of this study are useful for designing an optimal stereo-based imaging system that gives the best reconstruction with minimum error.
Reconstruction of a line in 3-D space using arbitrary perspective views involves the problem of obtaining the set of parameters representing the line. This is widely used in many applications of 3-D object recognition and machine inspection. A performance analysis of the reconstruction process in the presence of noise in the image planes is necessary in applications that require a high degree of accuracy. In this paper, a methodology based on the concept of the epipolar line is given for the reconstruction of a 3-D line from two arbitrary perspective views. The points in the second image plane that correspond to points in the first image plane are found by the epipolar line method, considering all the points in the first image plane. Triangulation is then used to find the points in 3-D space. Using least-squares regression in 3-D, the parameters of a line in 3-D space are found. This least-squares regression problem is solved by two different methods. Simulation results of this epipolar line based method in the presence of noise, as well as the results of the error analysis, are given.
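The abstract does not name the two least-squares methods it uses. One common way to fit a line to triangulated 3-D points is orthogonal least squares through the centroid and the dominant singular vector, sketched below as an assumption rather than as the paper's method. The function name, noise level, and sample line are hypothetical.

```python
import numpy as np

def fit_line_3d(points):
    """Orthogonal least-squares line through 3-D points.

    Returns (centroid, unit direction); the direction is the first right
    singular vector of the centred point matrix.
    """
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    _, _, Vt = np.linalg.svd(pts - centroid, full_matrices=False)
    return centroid, Vt[0]

# Noisy samples of a hypothetical line p0 + s * d, standing in for triangulated points.
rng = np.random.default_rng(2)
p0 = np.array([1.0, -2.0, 5.0])
d = np.array([1.0, 2.0, 2.0]) / 3.0                 # unit direction
s = np.linspace(-5.0, 5.0, 40)
points = p0 + s[:, None] * d + 0.01 * rng.standard_normal((40, 3))
centroid, direction = fit_line_3d(points)
angle_error = np.degrees(np.arccos(min(1.0, abs(direction @ d))))
```

The angle between the fitted and true directions gives a simple error measure for studying how image noise propagates into the reconstructed line.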