This PDF file contains the front matter associated with SPIE Proceedings Volume 10443, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
In this work, we present a pattern recognition system that combines the problem-solving power of genetic algorithms with the efficiency of morphological associative memories. We use a set of 48 tire prints divided into 8 tire brands; the images measure 200 x 200 pixels. We applied the Hough transform to obtain lines as the main features, yielding 449 lines. The genetic algorithm reduces these to ten suitable lines that achieve 100% recognition. Morphological associative memories served as the evaluation function. The selection schemes were tournament and roulette-wheel selection, and for reproduction we applied one-point, two-point, and uniform crossover.
Eye movement is a relatively new biometric feature with several advantages over features such as fingerprints, faces, and irises. It is not merely a static characteristic but a combination of brain activity and muscle behavior, which makes it effective against spoofing attacks. In addition, eye movements can be combined with faces, irises, and other features recorded from the face region in multimodal systems. In this paper, we present an exploratory study of eye movement identification with different classification methods, based on the eye movement datasets provided by Komogortsev et al. in 2011. Saccade and fixation durations are extracted from the eye movement data as features. Furthermore, a performance analysis of different classifiers, including BP, RBF, and Elman neural networks and the SVM, is conducted to provide a reference for future research in this field.
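Splitting a gaze trace into saccade and fixation time is commonly done with a velocity threshold (I-VT style); a minimal sketch, assuming a fixed sampling rate and a hypothetical 100 deg/s threshold:

```python
import numpy as np

def saccade_fixation_times(x, y, dt, v_thresh=100.0):
    """I-VT style split: samples faster than v_thresh deg/s count as saccade,
    the rest as fixation; returns total time in each state."""
    vx = np.diff(x) / dt
    vy = np.diff(y) / dt
    speed = np.hypot(vx, vy)
    saccade = speed > v_thresh
    return saccade.sum() * dt, (~saccade).sum() * dt

# 10 samples at 100 Hz: a fixation, one fast jump, another fixation
x = np.array([0, 0, 0, 0, 5, 10, 10, 10, 10, 10], dtype=float)
y = np.zeros(10)
t_sacc, t_fix = saccade_fixation_times(x, y, dt=0.01)
print(round(t_sacc, 2), round(t_fix, 2))   # 0.02 0.07
```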
This paper proposes a novel method for extracting facade structure from real-world pictures using local geometric moments. Compared with existing methods, the proposed method is easy to implement, has low computational cost, and is robust to noise such as uneven illumination, shadow, and occlusion by other objects. It is also faster and has lower space complexity, making it feasible for mobile devices and situations that require real-time processing. Specifically, a facade structure model is first proposed to support our noise reduction method, which binarizes the image with a self-adaptive local threshold based on a Gaussian-weighted average and exploits the features of the facade structure. Next, we divide the picture of the building into individual areas, each representing a door or a window. We then calculate the geometric moments and centroid of each area to identify collinear ones from their feature vectors, and replace each collinear group with a line. Finally, we analyze all the geometric moments and centroids to recover the facade structure of the building. We compare our results with other methods and, in particular, report results on pictures taken in bad environmental conditions. Our system targets two applications: the reconstruction of facades from high-resolution ground-based imagery, and positioning based on recognizing urban buildings.
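The geometric moment and centroid computation for an individual area can be sketched as follows (raw moments only; the collinearity test on the feature vectors is omitted):

```python
import numpy as np

def raw_moment(region, p, q):
    """Raw geometric moment M_pq = sum_x sum_y x^p y^q I(x, y)."""
    ys, xs = np.mgrid[0:region.shape[0], 0:region.shape[1]]
    return float(((xs ** p) * (ys ** q) * region).sum())

def centroid(region):
    """Centroid (M10/M00, M01/M00) of a binary region."""
    m00 = raw_moment(region, 0, 0)
    return raw_moment(region, 1, 0) / m00, raw_moment(region, 0, 1) / m00

# toy binary "window" region: a 3x3 block with top-left corner at (x=2, y=1)
win = np.zeros((6, 6))
win[1:4, 2:5] = 1
cx, cy = centroid(win)
print(cx, cy)   # 3.0 2.0: the block's center
```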
This paper presents a natural representation of numerical digits using hand activity analysis, based on the number of outstretched fingers shown for each digit in sequence in a video. The analysis determines a set of six features from a hand image; the most important features used from each frame are the topmost fingertip, the palm line, the palm center, and the valley points between the fingers above the palm line. With this approach a user can naturally convey any sequence of digits, each ranging from 0 to 9, using the right hand, the left hand, or both. Which hands are used can be recognized accurately from the valley points, which also reveals whether the user is right- or left-handed in practice. First, the hand(s) and face are detected using the YCbCr color space, and the face is removed using an ellipse-based method. Then, the hands are analyzed to recognize the activity representing a series of digits in the video. This work uses a pixel-continuity algorithm in a 2D coordinate geometry system and does not rely on calculus, contours, convex hulls, or training datasets.
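The valley-point idea for counting outstretched fingers can be illustrated on a 1D silhouette profile (a toy sketch; the paper's pixel-continuity algorithm on 2D coordinates is more involved):

```python
import numpy as np

def count_fingers(profile):
    """Count fingers as (number of valleys between peaks) + 1.
    `profile` holds the hand silhouette height per image column."""
    valleys = [i for i in range(1, len(profile) - 1)
               if profile[i] < profile[i - 1] and profile[i] < profile[i + 1]]
    return len(valleys) + 1 if profile.max() > 0 else 0

# toy profile: three raised fingers separated by two valleys
profile = np.array([0, 5, 2, 6, 1, 5, 0])
print(count_fingers(profile))   # 3
```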
Motivated by the problem of radar automatic target recognition (RATR), a novel machine learning method, multiclass multiple kernel learning based on support vector data description with negative examples (MMKL-NSVDD), is developed to classify the FFT-magnitude feature of complex high-resolution range profiles (HRRPs). The proposed method inherits the tight nonlinear boundary of the SVDD-neg model, which requires no assumptions about the data distribution or prior information, and also incorporates multiple kernels into the model, avoiding a fussy choice of kernel parameters while fusing information from multiple kernels. This leads to a remarkable improvement in recognition rate, as demonstrated by experimental results on the HRRPs of four aircraft, making MMKL-NSVDD well suited to HRRP-based radar target recognition.
Segmenting Chinese characters from degraded document images is a necessary step in optical character recognition (OCR), but it is challenging due to the various kinds of noise in such images. In this paper, we present three adaptive thresholding methods based on local first-order statistics for segmenting text from non-text in Chinese rubbing images. The segmentation results were evaluated both by visual inspection and numerically. In experiments, the methods obtained better results than classical techniques on the binarization of real Chinese rubbing images and the PHIBD 2012 datasets.
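One illustrative binarization from local first-order statistics (a Niblack-style local-mean rule; the paper's three specific methods are not detailed in the abstract) might look like:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def local_mean_threshold(img, win=15, offset=0.0):
    """Binarize with a first-order local statistic:
    pixel < (local mean - offset) -> text (0), else background (255)."""
    mean = uniform_filter(img.astype(float), size=win, mode="reflect")
    return (img.astype(float) >= mean - offset).astype(np.uint8) * 255

# toy: a dark stroke (value 40) over an uneven background (160 and 220 halves)
img = np.full((20, 20), 160.0)
img[:, 10:] = 220.0
img[8:12, 3:17] = 40.0
out = local_mean_threshold(img, win=7)
print(out[10, 5], out[2, 5])   # stroke -> 0, background -> 255
```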
Providing accurate recognition of human activities is a challenging problem for visual surveillance applications. In this paper, we present a simple and efficient algorithm for human activity recognition based on the wavelet transform. We adopt discrete wavelet transform (DWT) coefficients as features of human objects to exploit the advantages of its multiresolution approach. The proposed method is tested on multiple DWT levels. Experiments are carried out on standard action datasets including KTH and i3DPost. The proposed method is compared with state-of-the-art methods in terms of different quantitative performance measures and is found to have better recognition accuracy.
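A single level of the 2D Haar DWT, whose subband coefficients (or their energies) would serve as features, can be sketched as follows (the 1/4 normalization is one common convention):

```python
import numpy as np

def haar_dwt2(img):
    """One level of the 2D Haar DWT: approximation (LL) and
    detail (LH, HL, HH) subbands, averaged normalization."""
    a = img[0::2, :] + img[1::2, :]      # row-pair sums
    d = img[0::2, :] - img[1::2, :]      # row-pair differences
    LL = (a[:, 0::2] + a[:, 1::2]) / 4.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 4.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 4.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 4.0
    return LL, LH, HL, HH

# constant image: all energy stays in the approximation band
img = np.full((4, 4), 8.0)
LL, LH, HL, HH = haar_dwt2(img)
print(LL[0, 0], float(abs(LH).sum() + abs(HL).sum() + abs(HH).sum()))   # 8.0 0.0
```

Deeper levels are obtained by applying `haar_dwt2` again to `LL`; a feature vector can then pool, e.g., subband energies across levels.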
Pattern recognition of concrete surface crack defects is very important in determining the stability of structures such as buildings, roads, and bridges. Surface cracks are a key subject in inspection, diagnosis, and maintenance, as well as in life prediction for structural safety. Traditionally, determining defects and cracks on concrete surfaces is done by manual inspection, and internal defects in the concrete require destructive testing for detection. The researchers created an automated surface crack detection system for concrete using image processing techniques including the Hough transform, weighted Laplacian of Gaussian (LoG), dilation, grayscale conversion, Canny edge detection, and the Haar wavelet transform. An automatic surface crack detection robot is designed to capture the concrete surface using a sectoring method. Surface crack classification was done with a Haar-trained cascade object detector using both positive and negative samples, which showed that the surface crack defects can be identified effectively.
When Chinese is learned as a second language, its characters are taught step by step, from strokes to components and radicals and their complex relations. Chinese characters written in digital ink by non-native writers are seriously deformed, so global recognition approaches perform poorly. We therefore present a progressive, bottom-up approach based on hierarchical models, in which the hierarchical information comprises strokes and hierarchical components and each Chinese character is modeled as a hierarchical tree. Strokes in an ink character are classified with hidden Markov models and concatenated into a stroke symbol sequence. The structure of the components in the ink character is then extracted, and according to the extraction result and the stroke symbol sequence, candidate characters are traversed and scored. Finally, the recognition candidates are ranked in descending order of score. The method is validated on 19,815 samples of handwritten Chinese characters written by foreign students.
Automobiles have become an essential part of our everyday lives. Many factors can affect a vehicle, some of which cause inconvenience or, in some cases, harm to lives or property. This work therefore focuses on detecting problems in the engine, body, and other parts of an automatic-transmission vehicle from the vibration and sound they produce, using MATLAB. Sound, vibration, and temperature sensors detect defects in the car, and a transmitter and receiver gather the data wirelessly, making the system easy to install in the vehicle. Following a technique used at Toyota Balintawak Philippines, the car is treated as panels (a, b, c, d, and e), with 'a' spanning the hood to the front wheel and 'e' the rear windshield to the back of the car; this guided the placement of the sensors so that precise data could be gathered. The gathered data are compared with a normal graph taken from the vehicle in normal condition, and readings that deviate from the normal graph by more than 50% are considered to indicate a problem. The system is designed to help prevent car accidents, and keep people from harm, by determining the current status and performance of the vehicle.
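The 50% deviation rule against the normal graph can be sketched as follows (function names and the tolerance parameter are illustrative):

```python
import numpy as np

def panel_alert(reading, baseline, tolerance=0.5):
    """Flag a panel when its sensor reading deviates from the baseline
    (normal-condition) profile by more than `tolerance` (50%) at any sample."""
    deviation = np.abs(reading - baseline) / np.abs(baseline)
    return bool((deviation > tolerance).any())

baseline = np.array([10.0, 12.0, 11.0, 10.5])
print(panel_alert(np.array([11.0, 12.5, 10.0, 10.0]), baseline))  # False: within 50%
print(panel_alert(np.array([11.0, 20.0, 10.0, 10.0]), baseline))  # True: 20 vs 12 is +67%
```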
Forest species identification is a special case of texture classification that can be solved with hand-crafted features. Convolutional networks (ConvNets) can learn features adaptively and have achieved impressive results in complicated recognition tasks. This paper presents an improvement to a ConvNet-based approach [1] for forest species identification. Because of the small amount of training data, we propose adding a dropout layer to the ConvNet architecture and using data augmentation to enlarge the training set. A new classification process that combines the ConvNet outputs of the individual image patches is also proposed. Our improved ConvNet-based method achieves promising results.
In finger vein image preprocessing, finger angle correction and ROI extraction are important parts of the system. In this paper, we propose an angle correction algorithm based on the centroid of the vein image and extract the ROI with a bidirectional gray projection method. Inspired by the fact that vein areas appear as valleys, a novel method is proposed to extract the center and width of the veins based on multi-directional gradients; it is easy to compute, quick, and stable. On this basis, an encoding method is designed to determine the gray value distribution of the texture image, which effectively avoids texture extraction errors at the edges. Finally, fuzzy threshold determination and a global gray value matching algorithm give the system higher robustness and recognition accuracy. Experimental results on pairs of matched images show that the proposed method achieves an EER of 3.21% and extracts features at a speed of 27 ms per image. It can be concluded that the proposed algorithm has clear advantages in texture extraction efficiency, matching accuracy, and algorithm efficiency.
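The bidirectional gray projection step for ROI extraction can be sketched as follows (the 50% projection cutoff is an assumed parameter):

```python
import numpy as np

def projection_roi(img, frac=0.5):
    """Crop the ROI where the row and column gray-level projections
    exceed `frac` of their maximum (bidirectional gray projection)."""
    rows = img.sum(axis=1).astype(float)
    cols = img.sum(axis=0).astype(float)
    r = np.nonzero(rows > frac * rows.max())[0]
    c = np.nonzero(cols > frac * cols.max())[0]
    return img[r[0]:r[-1] + 1, c[0]:c[-1] + 1]

# toy image: a bright finger region inside a dark frame
img = np.zeros((10, 10))
img[2:8, 3:9] = 100.0
roi = projection_roi(img)
print(roi.shape)   # (6, 6)
```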
E-learning is an important medium for any educational institution, and successful information design for e-learning depends on the characteristics of its users. This study explores the differences between the eye movement data of novice and expert users, comparing and identifying the two groups on the basis of gaze features. Each participant performed three main e-learning tasks. The results show that there are measurable differences between the gaze features of experts and novices.
Tracking the driver's face is essential for driving safety control. Such systems are usually designed with complicated algorithms that recognize the driver's face on powerful computers. The design problem concerns not only the detection rate but also component damage in rigorous environments with vibration, heat, and humidity. A feasible strategy to counteract such damage is to integrate the entire system into a single chip, minimizing installation dimensions, weight, power consumption, and exposure to air. Meanwhile, an extraordinary methodology is also indispensable to overcome the dilemma between low computing capability and real-time performance on a low-end chip. In this paper, a novel driver face tracking system is proposed that employs semantics-based vague image representation (SVIR) for minimal hardware resource usage on an FPGA while guaranteeing real-time performance. Our experimental results indicate that the proposed face tracking system is viable and promising for future smart car designs.
Deep convolutional neural networks for face recognition, from DeepFace to the recent FaceNet, demand a sufficiently large volume of filters for feature extraction, in addition to being deep. Shallow filter-bank approaches, e.g., the principal component analysis network (PCANet), binarized statistical image features (BSIF), and other analogous variants, suffer from a filter scarcity problem: not all of the available PCA and ICA filters are discriminative enough to abstract noise-free features. This paper extends our previous work on multi-fold filter convolution (ℳ-FFC), in which the pre-learned PCA and ICA filter sets are exponentially diversified by ℳ folds to instantiate PCA, ICA, and PCA-ICA offspring. The experimental results reveal that the 2-FFC operation resolves the filter scarcity problem, and that the 2-FFC descriptors are superior to those of PCANet, BSIF, and other face descriptors in terms of rank-1 identification rate (%).
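Learning a bank of PCA filters from image patches, the starting point that ℳ-FFC diversifies, can be sketched as follows (a PCANet-style illustration; the fold-wise convolution of filter sets is not shown):

```python
import numpy as np

def learn_pca_filters(images, k=4, patch=5):
    """Learn a bank of k PCA filters from densely sampled, mean-removed
    patches, as in PCANet-style shallow filter-bank descriptors."""
    h = patch // 2
    patches = []
    for img in images:
        for y in range(h, img.shape[0] - h):
            for x in range(h, img.shape[1] - h):
                p = img[y - h:y + h + 1, x - h:x + h + 1].ravel()
                patches.append(p - p.mean())        # remove patch mean
    X = np.array(patches)
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt[:k].reshape(k, patch, patch)          # top-k principal directions

rng = np.random.default_rng(0)
filters = learn_pca_filters([rng.random((12, 12)) for _ in range(3)], k=4)
print(filters.shape)   # (4, 5, 5)
```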
In this paper, we propose a framework for a low-cost, secure electronic voting system based on face recognition. The local binary pattern (LBP) is used to characterize face features as texture, and the chi-square distance is then used for classification. Two parallel systems, a smartphone application and a web application, are developed for the face learning and verification modules. The proposed system has two tiers of security, a person ID followed by face verification, with a class-specific threshold controlling the security level of the verification. Our system is evaluated on three standard databases and one real home-collected database and achieves satisfactory recognition accuracies. The proposed system thus provides secure, hassle-free voting and is less intrusive than other biometrics.
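A basic LBP histogram with the chi-square distance can be sketched as follows (8-neighbour codes on a single block; the paper's class-specific thresholding is omitted):

```python
import numpy as np

def lbp_histogram(img):
    """Basic 8-neighbour LBP codes, pooled into a normalized 256-bin histogram."""
    c = img[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros_like(c, dtype=np.int32)
    for bit, (dy, dx) in enumerate(shifts):
        n = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        codes |= ((n >= c).astype(np.int32) << bit)   # one bit per neighbour
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()

def chi_square(h1, h2, eps=1e-10):
    """Chi-square distance between two LBP histograms."""
    return float(((h1 - h2) ** 2 / (h1 + h2 + eps)).sum())

rng = np.random.default_rng(1)
face = rng.integers(0, 256, (32, 32))
h = lbp_histogram(face)
print(round(chi_square(h, h), 6))   # 0.0: identical images match exactly
```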
Face recognition has been widely studied, yet video-based face recognition remains challenging because of the low quality and large intra-class variation of face images captured from video. In this paper, we focus on two scenarios of video-based face recognition: 1) Still-to-Video (S2V), i.e., querying a still face image against a gallery of video sequences, and 2) Video-to-Still (V2S), the reverse of the S2V scenario. We propose a novel method that transfers still and video face images into a Euclidean space with a carefully designed convolutional neural network, where Euclidean metrics measure the distance between still and video images. The identities of still and video images grouped as pairs serve as supervision. In the training stage, a joint loss function measuring the Euclidean distance between the predicted features of training pairs and the expanded vectors of still images is optimized to minimize the intra-class variation, while the inter-class variation is guaranteed by the large margin of the still images. The transferred features are learned with the designed convolutional neural network. Experiments on the COX face dataset show that our method achieves reliable performance compared with other state-of-the-art methods.
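Once still and video images are mapped into the common Euclidean space, identification reduces to nearest-neighbour search; a minimal sketch (the embeddings and names below are made up for illustration):

```python
import numpy as np

def match_identity(query_emb, gallery_embs, gallery_ids):
    """Nearest-neighbour match in the learned Euclidean embedding space."""
    d = np.linalg.norm(gallery_embs - query_emb, axis=1)
    return gallery_ids[int(d.argmin())]

# hypothetical 2-D embeddings of two gallery identities
gallery = np.array([[0.9, 0.1], [0.1, 0.95]])
ids = ["alice", "bob"]
print(match_identity(np.array([0.85, 0.2]), gallery, ids))   # alice
```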
The rise of the Internet of Things has promoted the development of single-board computers, whose processor speed and memory capacity keep increasing. More and more applications can now complete their computation on the board and send the organized information over the network to the cloud for further processing, so the board is no longer merely a data-acquisition device. This study installs OpenCV on an Asus Tinker Board for real-time face detection and capture; the captured face is sent to the Microsoft Cognitive Services cloud database for comparison, to determine the mood the face expresses and the name of the corresponding person; finally, text-to-speech reads out that name to complete the identification. The Asus Tinker Board used in this study has an ARM-based CPU with high efficiency and low power consumption, together with improved memory and hardware performance for a development board.
Pose variation, which can cause a loss of useful information, has always been a bottleneck in face and ear recognition. To address this problem, we propose a multimodal recognition approach based on local features of the face and ear that is robust to large facial pose variations in unconstrained scenes. A deep learning method is used for facial pose estimation, and a well-trained Faster R-CNN is used to detect and segment the face and ear regions. We then propose a weighted region-based recognition method to handle the local features. The proposed method achieves state-of-the-art recognition performance, especially when the images are affected by pose variations and random occlusion in unconstrained scenes.
Digital cameras and smartphones with orientation sensors allow auto-rotation of portrait images. Auto-rotation uses the image file's metadata, the exchangeable image file format (EXIF): the sensor output sets the EXIF orientation flag to reflect the positioning of the camera with respect to the ground. Unfortunately, software support for this feature is not widespread or consistently applied. Our research goal is to create the EXIF orientation flag by detecting the upright direction of face images that have no orientation flag, and to apply this in photo-organizing software. In this paper, we propose a novel upright detection scheme for face images that relies on generating rotated images in four directions and on part-based face detection with Haar-like features. The input images are frontal faces, rotated in-plane in the four possible directions. Among the four rotated images, if exactly one is accepted by the face detector and the other three are rejected, the upright direction is obtained from the accepted direction. The rotation angle for the EXIF orientation is 0 degrees, 90 degrees clockwise, 90 degrees counter-clockwise, or 180 degrees. Experimental results on 450 face image samples show that the proposed method is very effective in detecting the upright direction of face images with background variations.
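The four-rotation voting scheme can be sketched as follows; the face detector is replaced by a stub, and the angle-to-flag mapping is our assumption about the EXIF convention:

```python
import numpy as np

# EXIF orientation flags for the four in-plane rotations (assumed mapping)
EXIF_FLAG = {0: 1, 90: 8, 180: 3, 270: 6}

def detect_upright(img, face_detector):
    """Try the four in-plane rotations; accept an upright direction only
    when exactly one rotation passes the face detector."""
    hits = [angle for k, angle in enumerate((0, 90, 180, 270))
            if face_detector(np.rot90(img, k))]
    return (hits[0], EXIF_FLAG[hits[0]]) if len(hits) == 1 else (None, None)

# stub detector for illustration: "face" = much brighter top half
def stub_detector(img):
    h = img.shape[0] // 2
    return img[:h].mean() > 1.5 * img[h:].mean()

img = np.zeros((8, 8)); img[6:, :] = 10.0          # bright region at the bottom
angle, flag = detect_upright(img, stub_detector)
print(angle, flag)   # 180 3
```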
In this paper, we have developed an algorithm to track the pose of a human face robustly and efficiently. Face pose estimation is very useful in many applications such as building virtual reality systems and creating an alternative input method for the disabled. Firstly, we have modified a face detection toolbox called DLib for the detection of a face in front of a camera. The detected face features are passed to a pose estimation method, known as the four-point algorithm, for pose computation. The theory applied and the technical problems encountered during system development are discussed in the paper. It is demonstrated that the system is able to track the pose of a face in real time using a consumer grade laptop computer.
Co-segmentation is an effective method for discovering and segmenting out common objects from multiple images: making full use of the relationships between images yields more accurate segmentation than processing a single image alone. The first step processes each single image, employing hierarchical segmentation to obtain a contour map, saliency detection to obtain a saliency map, and object detection to find the possible common parts. Then a digraph is constructed over the multiple local regions and processed. Because the correspondence between adjacent images in the digraph influences the co-segmentation results, this paper develops a method to order the images to be co-segmented. We test the method on the iCoseg and MSRC datasets and compare it with four existing methods. The results show that it achieves efficient co-segmentation with higher precision than many existing and conventional co-segmentation methods.
The carotid artery (CA) is one of the vital organs of the human body. The CA features that can be used are position, size, and volume; the position feature can be used to initialize tracking. The CA can be examined with ultrasound, but ultrasound imaging is operator-dependent, so the images obtained by two or more operators may differ, which affects the determination of the CA. To reduce this subjectivity among operators, the position of the CA can be determined automatically. In this study, the proposed method segments the CA in B-mode ultrasound images based on morphology, geometry, and gradient direction. The study consists of three steps: data collection, preprocessing, and artery segmentation. The data were taken directly by the researchers and from the Brno University signal processing lab database; each data set contains 100 carotid artery B-mode ultrasound images. The artery is modeled as an ellipse with center c, major axis a, and minor axis b. The proposed method scores highly on each data set: 97% (data set 1), 73% (data set 2), and 87% (data set 3). These segmentation results will then be used in the process of tracking the CA.
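Estimating the ellipse parameters (center c, axes a and b) from a candidate segmentation can be sketched via second-order moments (one possible route; the paper's morphology/gradient pipeline is not reproduced):

```python
import numpy as np

def fit_ellipse_moments(mask):
    """Estimate ellipse centre and axis lengths of an artery cross-section
    from a binary segmentation mask via second-order central moments."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()
    mxx = ((xs - cx) ** 2).mean()
    myy = ((ys - cy) ** 2).mean()
    # for an axis-aligned filled ellipse, variance = (semi-axis)^2 / 4
    return (cx, cy), 2 * np.sqrt(mxx), 2 * np.sqrt(myy)

# synthetic filled ellipse: centre (30, 20), semi-axes a = 12, b = 8
ys, xs = np.mgrid[0:40, 0:60]
mask = ((xs - 30) / 12.0) ** 2 + ((ys - 20) / 8.0) ** 2 <= 1.0
(cx, cy), a, b = fit_ellipse_moments(mask)
print(round(cx), round(cy))   # 30 20
```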
The number of bags needs to be counted automatically during baggage self-check-in. This paper proposes a fast airline baggage counting method using image segmentation of a height map projected from the scanned 3D point cloud of the baggage. Because there is a height drop at the actual edge of each bag, the edges can be found with an edge detection operator. Closed edge chains are then formed by linking the edge lines through morphological processing. Finally, the number of connected regions segmented by the closed chains is taken as the baggage count. A multi-bag experiment performed under different placement modes demonstrates the validity of the method.
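The final step of the pipeline above counts connected regions in the segmented height map. A minimal sketch of that counting step, assuming a binary mask where 1 marks pixels above the conveyor plane (flood fill stands in for the paper's closed-edge-chain segmentation):

```python
from collections import deque

def count_bags(mask):
    """Count connected foreground regions (bags) in a binary height-map mask.

    mask: 2D list of 0/1. Uses 4-connected BFS flood fill; each
    previously unseen foreground pixel starts a new region.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                count += 1                      # new bag found
                q = deque([(r, c)])
                seen[r][c] = True
                while q:                        # flood-fill this region
                    y, x = q.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            q.append((ny, nx))
    return count
```

In the actual system the mask would come from thresholding the height drop at bag edges; here any binary image works.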
The automated segmentation of cell nuclei is an essential stage in the quantitative image analysis of cell nuclei extracted from smear cytology images of pleural fluid. Cell nuclei can indicate cancer, as their characteristics are associated with cell proliferation and malignancy in terms of size, shape and stain color. Nevertheless, automatic nuclei segmentation remains challenging due to artifacts caused by slide preparation and nuclei heterogeneity: poor contrast, inconsistent stain color, cell variation and cell overlap. In this paper, we propose a watershed-based method capable of segmenting the nuclei of a variety of cells from cytology pleural fluid smear images. First, the original image is converted to grayscale and enhanced by intensity adjustment and histogram equalization. Next, the cell nuclei are segmented into a binary image using Otsu thresholding, and undesirable artifacts are eliminated using morphological operations. Finally, a distance-transform-based watershed method is applied to separate touching and overlapping cell nuclei. The proposed method is tested on 25 Papanicolaou (Pap) stained pleural fluid images and achieves an accuracy of 92%. The method is relatively simple, and the results are very promising.
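The binarization step above relies on Otsu thresholding, which picks the gray level that maximizes the between-class variance of the histogram. A self-contained sketch of that step (not the paper's full pipeline, which continues with morphology and watershed):

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold for a sequence of 8-bit gray values.

    Scans every candidate threshold t and keeps the one maximizing the
    between-class variance w_bg * w_fg * (mean_bg - mean_fg)^2.
    """
    pixels = list(pixels)
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))
    sum_bg = 0.0
    w_bg = 0
    best_t, best_var = 0, -1.0
    for t in range(256):
        w_bg += hist[t]                 # background weight grows with t
        if w_bg == 0:
            continue
        w_fg = total - w_bg
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```

Pixels above the returned threshold would be treated as nuclei foreground before the morphological cleanup.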
Statistical data processing is one of the most important activities in many fields of scientific study, and is often the only way to probe the processes underlying a given phenomenon. The two classical approaches to solar time series analysis operate in the space domain and the spectral domain. In the present paper, the relative phase relationship of sunspot unit area on the two hemispheres is investigated through long-range correlation and wavelet transform analysis. It is found that (1) the north-south asynchrony of sunspot unit area cannot be regarded as a stochastic phenomenon, because its behavior exhibits a long-term tendency; (2) the leading hemisphere of sunspot unit area is the southern hemisphere before 1962 and the northern hemisphere from then until 2008; and (3) sunspot unit area should be used to represent long-term solar magnetic activity. These results could guide further research on the physical mechanisms of the north-south asynchrony of solar magnetic activity. Moreover, long-range correlation analysis and the wavelet transform of solar time series provide crucial information for understanding, describing and predicting long-term solar variability.
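The hemispheric lead/lag question above amounts to finding the time shift at which one hemisphere's series best correlates with the other. A toy stand-in for that phase analysis (the paper's actual tools are long-range correlation and wavelets; `best_lag` is a hypothetical helper using plain lagged Pearson correlation):

```python
def best_lag(north, south, max_lag):
    """Return the lag at which `south` best predicts `north`.

    A positive lag means the southern series leads by that many samples.
    """
    def corr(a, b):
        n = len(a)
        ma, mb = sum(a) / n, sum(b) / n
        cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        va = sum((x - ma) ** 2 for x in a) ** 0.5
        vb = sum((y - mb) ** 2 for y in b) ** 0.5
        return cov / (va * vb) if va and vb else 0.0

    best_l, best_r = 0, -2.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:   # south leads: compare north[lag:] with south[:-lag]
            a, b = north[lag:], south[:len(south) - lag]
        else:          # north leads
            a, b = north[:len(north) + lag], south[-lag:]
        r = corr(a, b)
        if r > best_r:
            best_l, best_r = lag, r
    return best_l
```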
Image registration is a key technology in digital imaging and is widely used. In this paper we study image registration techniques and, building on the D-Nets registration algorithm, propose a new variant. We first process the images to obtain synthetic images combining the originals with their enhanced versions, and extract SIFT features from the original image. To reduce image noise, the synthesized image is then smoothed with a Gaussian filter. Registration experiments are carried out on the synthetic images using the D-Nets algorithm. Compared with the existing method, the proposed approach greatly improves accuracy and recall.
The tremendous technological advancements of recent times have enabled people to create, edit and circulate images more easily than ever before. As a result, ensuring the integrity and authenticity of images has become challenging. Malicious editing of images to deceive the viewer is referred to as image tampering. A widely used tampering technique is image splicing or compositing, in which regions from different images are copied and pasted together. In this paper, we propose a tamper detection method that exploits the blocking and blur artifacts that are the footprints of splicing. Images are classified as tampered or authentic based on the standard deviations of entropy histograms and block discrete cosine transforms. If an image is classified as tampered, we can also detect the exact boundaries of the tampered region. Experimental results on publicly available image tampering datasets show that the proposed method outperforms existing methods in terms of accuracy.
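One of the statistics mentioned above is the spread of per-block entropies: spliced regions tend to disturb the local entropy distribution. A minimal sketch of such a block-entropy statistic, assuming a 2D grayscale image (the paper's full classifier also uses block DCT coefficients, omitted here):

```python
import math

def block_entropy_std(img, block=4):
    """Standard deviation of per-block Shannon entropies of a gray image.

    img: 2D list of 8-bit values. The image is tiled into block x block
    squares; each square's intensity entropy is computed, and the spread
    of those entropies is returned.
    """
    h, w = len(img), len(img[0])
    ents = []
    for by in range(0, h - block + 1, block):
        for bx in range(0, w - block + 1, block):
            counts = {}
            for y in range(by, by + block):
                for x in range(bx, bx + block):
                    v = img[y][x]
                    counts[v] = counts.get(v, 0) + 1
            n = block * block
            ent = -sum((c / n) * math.log2(c / n) for c in counts.values())
            ents.append(ent)
    mean = sum(ents) / len(ents)
    return (sum((e - mean) ** 2 for e in ents) / len(ents)) ** 0.5
```

A classifier would threshold statistics like this one to flag an image as possibly tampered.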
In previous work, a probabilistic image matching model for binary images was developed that predicts the number of mappings required to detect dissimilarity between any pair of binary images, based on the amount of similarity between them. The model showed that dissimilarity can be detected quickly by randomly comparing corresponding points between two binary images. In this paper, we improve on this speed for images whose dissimilarity is concentrated near their centers. We apply smart mapping schemes to different image sets and analyze the results to show the effectiveness of such schemes for center-concentrated dissimilarity. Three smart mapping schemes are compared at three mapping densities to find the best-performing combination of mapping and density.
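The core idea above, counting random corresponding-point comparisons until a mismatch is found, can be sketched as follows. The `center_sampler` is a hypothetical example of a "smart" center-biased mapping; the paper's actual schemes may differ:

```python
import random

def mappings_to_detect(img_a, img_b, sampler, seed=0):
    """Count random point comparisons until a differing pixel is found.

    sampler(rng, h, w) -> (y, x) draws a coordinate. Returns the number
    of comparisons used, or None if no difference is found in the budget.
    """
    rng = random.Random(seed)
    h, w = len(img_a), len(img_a[0])
    for k in range(1, 10_000):
        y, x = sampler(rng, h, w)
        if img_a[y][x] != img_b[y][x]:
            return k
    return None

def uniform_sampler(rng, h, w):
    return rng.randrange(h), rng.randrange(w)

def center_sampler(rng, h, w):
    # Averaging two uniform draws gives a triangular distribution
    # peaked at the image center -- a simple center-biased mapping.
    y = (rng.randrange(h) + rng.randrange(h)) // 2
    x = (rng.randrange(w) + rng.randrange(w)) // 2
    return y, x
```

For dissimilarity clustered at the center, the biased sampler tends to need fewer comparisons on average than the uniform one.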
The rapid growth of different types of images has posed a great challenge to the scientific community. As the number of images grows daily, organizing them for efficient and easy access is becoming a challenging task; the field of image retrieval attempts to solve this problem through various techniques. This paper proposes a novel image retrieval technique combining the Scale Invariant Feature Transform (SIFT) and the co-occurrence matrix. To construct the feature vector, SIFT descriptors of grayscale images are computed and normalized using z-score normalization, followed by construction of a Gray-Level Co-occurrence Matrix (GLCM) of the normalized SIFT keypoints. The constructed feature vector is matched against those of the database images to retrieve visually similar images. The proposed method is tested on the Corel-1K dataset, with performance measured in terms of precision and recall. The experimental results demonstrate that the proposed method outperforms several other state-of-the-art methods.
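The two building blocks named above, z-score normalization and the co-occurrence matrix, can be sketched as below. This is an illustrative simplification: the paper builds the GLCM over normalized SIFT keypoint values, while here `glcm` is shown on a plain quantized image:

```python
def zscore(vec):
    """Z-score normalize one feature vector (as done to SIFT descriptors)."""
    n = len(vec)
    mean = sum(vec) / n
    sd = (sum((v - mean) ** 2 for v in vec) / n) ** 0.5
    return [(v - mean) / sd if sd else 0.0 for v in vec]

def glcm(img, levels, dy=0, dx=1):
    """Gray-level co-occurrence count matrix for one pixel offset.

    img: 2D list of integer gray levels in [0, levels). The default
    offset (dy=0, dx=1) counts each pixel against its right neighbor.
    """
    m = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                m[img[y][x]][img[ny][nx]] += 1
    return m
```

Texture statistics (contrast, energy, homogeneity) computed from `m` would then join the normalized descriptors in the final feature vector.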
Diagnostic ultrasound (US) is an important tool in today's sophisticated medical diagnostics. Nearly every medical discipline benefits from this relatively inexpensive method, which provides a view of the inner organs of the human body without exposing the patient to any harmful radiation. Medical diagnostic images are usually corrupted by noise during acquisition, most of it speckle noise. To address this problem, Non-Local Means based filters were used to denoise the images instead of the widely used adaptive filters. Ultrasound images of several regions of the human body (abdomen, orthopedic, liver, kidney, breast and prostate), acquired with a Siemens SONOLINE G60 S system, were used in a comparative analysis. The outputs were compared using metrics such as SNR, RMSE, PSNR, IMGQ and SSIM, and the compared results are presented in tabular form.
An action description method named Motion History Point Cloud (MHPC) is proposed in this paper. MHPC compresses an action into a three-dimensional point cloud that incorporates depth information: the spatial coordinate channels record the motion foreground, and the color channels record the temporal variation. Because it contains depth information, MHPC can depict an action in more detail than a Motion History Image (MHI). MHPC can serve as a preprocessed input for various classification methods, such as Bag of Words and deep learning. An action recognition scheme is provided as an application example of MHPC: the Harris3D detector and the Fast Point Feature Histogram (FPFH) are used to extract and describe features from the MHPC, after which Bag of Words and a multi-class Support Vector Machine (SVM) perform the recognition. Experiments show that rich features can be extracted from MHPC to support action recognition even after downsampling. The feasibility and effectiveness of MHPC are also verified by comparing this scheme with two similar methods.
Scanning infrared imaging systems often suffer from stripe non-uniformity. Owing to the geometric characteristics of stripes in scanned images, the gradient across the scanning direction is much larger than the gradient along it, and the latter is closer to that of the real scene. The reason is that pixels along the scanning direction share uniform response parameters, while pixels across it have non-uniform parameters. Therefore, a homogenization method based on a unidirectional variation model is proposed in this paper: the model minimizes the gradient across the scanning direction, while the homogenization step preserves edges and detail along the scanning direction. Experimental results demonstrate the good performance of the proposed method on images with stripe non-uniformity.
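The simplest instance of the idea above, removing only the offset component of the cross-scan non-uniformity, is to equalize the column means. This is a toy stand-in for the unidirectional variation model, not the paper's optimization:

```python
def destripe_columns(img):
    """Equalize column means to suppress vertical stripe non-uniformity.

    Each column's mean is shifted to the global mean, which removes the
    offset part of the cross-scan gradient while leaving variation along
    the scan (row) direction untouched.
    """
    h, w = len(img), len(img[0])
    col_means = [sum(img[y][x] for y in range(h)) / h for x in range(w)]
    global_mean = sum(col_means) / w
    return [[img[y][x] - col_means[x] + global_mean for x in range(w)]
            for y in range(h)]
```

The full method instead minimizes a unidirectional total-variation energy, which also handles gain non-uniformity and protects scene edges.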
Keratoconus is a progressive cornea disease that can lead to serious myopia and astigmatism, or even to corneal transplantation, if it becomes worse. The early detection of keratoconus is extremely important to know and control its condition. In this paper, we propose an automatic diagnosis algorithm for keratoconus to discriminate the normal eyes and keratoconus ones. We select the parameters obtained by Oculyzer as the feature of cornea, which characterize the cornea both directly and indirectly. In our experiment, 289 normal cases and 128 keratoconus cases are divided into training and test sets respectively. Far better than other kernels, the linear kernel of SVM has sensitivity of 94.94% and specificity of 97.87% with all the parameters training in the model. In single parameter experiment of linear kernel, elevation with 92.03% sensitivity and 98.61% specificity and thickness with 97.28% sensitivity and 97.82% specificity showed their good classification abilities. Combining elevation and thickness of the cornea, the proposed method can reach 97.43% sensitivity and 99.19% specificity. The experiments demonstrate that the proposed automatic diagnosis method is feasible and reliable.
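The two figures reported throughout the abstract above are computed from the confusion matrix of the classifier. A small sketch, treating keratoconus as the positive class:

```python
def sensitivity_specificity(y_true, y_pred):
    """Compute (sensitivity, specificity) for binary labels.

    Labels: 1 = keratoconus (positive), 0 = normal.
    Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sens = tp / (tp + fn) if tp + fn else 0.0
    spec = tn / (tn + fp) if tn + fp else 0.0
    return sens, spec
```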
Analyzing ultrasound (US) images to recover the shapes and structures of particular anatomical regions is an interesting field of study, since US imaging is a non-invasive way to capture internal structures of the human body. However, bone segmentation in US images is still challenging because of strong speckle noise and poor image quality. This paper proposes a combination of local phase symmetry and quadratic polynomial fitting to extract the bone outer contour (BOC) from two-dimensional (2D) B-mode US images, as an initial step toward three-dimensional (3D) bone surface reconstruction. The bone is first extracted from the US images using local phase symmetry. The BOC is then extracted by scanning for one boundary pixel in each image column using a first-phase-feature search. Quadratic polynomial fitting is used to refine the contour and estimate the pixel locations that the extraction step fails to detect: the fitted polynomial coefficients are used to fill the gaps with new pixels. The proposed method estimates the new pixel positions while ensuring smoothness and continuity of the contour path. Evaluations on cow and goat bones compare the resulting BOCs with contours produced by manual segmentation and by Canny edge detection. The evaluation shows that the proposed method produces excellent results, with an average MSE of 0.65 both before and after hole filling.
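The hole-filling step above fits a quadratic to the detected contour points and evaluates it at the missing columns. A self-contained sketch of the least-squares fit via the normal equations (the paper does not specify its solver; this is one standard way):

```python
def fit_quadratic(xs, ys):
    """Least-squares fit y = a*x^2 + b*x + c; returns [a, b, c].

    Solves the 3x3 normal-equation system with Gaussian elimination,
    so missing contour pixels can be estimated as a*x^2 + b*x + c.
    """
    # Power sums s[k] = sum of x^k build the symmetric normal matrix.
    s = [sum(x ** k for x in xs) for k in range(5)]
    A = [[s[4], s[3], s[2]],
         [s[3], s[2], s[1]],
         [s[2], s[1], s[0]]]
    rhs = [sum(y * x * x for x, y in zip(xs, ys)),
           sum(y * x for x, y in zip(xs, ys)),
           sum(ys)]
    # Forward elimination with partial pivoting.
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(A[r][i]))
        A[i], A[piv] = A[piv], A[i]
        rhs[i], rhs[piv] = rhs[piv], rhs[i]
        for r in range(i + 1, 3):
            f = A[r][i] / A[i][i]
            for c in range(i, 3):
                A[r][c] -= f * A[i][c]
            rhs[r] -= f * rhs[i]
    # Back substitution.
    coef = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        coef[i] = (rhs[i] - sum(A[i][j] * coef[j]
                                for j in range(i + 1, 3))) / A[i][i]
    return coef
```

Given the detected (column, row) pairs, evaluating the fitted polynomial at undetected columns yields the new contour pixels while keeping the path smooth.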
Acne vulgaris, commonly called acne, is a skin problem that occurs when oil and dead skin cells clog a person's pores, a consequence of hormonal changes that make the skin oilier. The problem is that people rarely have a real assessment of their skin's sensitivity in terms of the facial fluid buildup that tends to develop into acne vulgaris, and thus suffer further complications. This research aims to assess acne vulgaris using a luminescent visualization system through optical imaging and integrated image processing algorithms. Specifically, it aims to design a prototype for facial fluid analysis based on a fluorescent imaging system, and to classify the different facial fluids present in each person. In processing, some structures and layers of the face are excluded, leaving only a mapped facial structure with acne regions; facial fluid regions are distinguished from acne regions because they are characterized differently.
Automated analysis of histological images helps diagnose and further classify breast cancer. Fully automated approaches can be used to pinpoint images for further analysis by the physician. But tissue images are especially challenging for both manual and automated approaches, due to mixed patterns and textures in which malignant regions are sometimes difficult to detect unless the disease is at a very advanced stage. Major challenges include irregular and very diffuse patterns, as well as the difficulty of defining winning features and classifier models. Although the diffuse nature of the tissue also makes correct region segmentation hard, it remains crucial to compute low-level features over individual regions rather than the whole image, and to select those with the best outcomes. In this paper we report on experiments building a region classifier with a simple subspace division and a feature selection model that improves results over image-wide and/or limited feature sets. Experimental results show modest accuracy for classifiers applied over the whole image, while the combination of image division, per-region low-level feature extraction and feature selection, together with a neural network classifier, achieves the best accuracy for the dataset and settings used in our experiments. Future work involves deep learning techniques, adding structure semantics, and embedding the approach as a tumor-finding helper in a practical medical imaging application.
Visual processing skills are used to gather visual information from the environment; when they fail, a Visual Processing Disorder (VPD) occurs. Visual figure-ground discrimination is a type of VPD in which color is one of the contributing factors. Color plays a vital role in everyday living, yet individuals with limited or inaccurate color perception suffer from Color Vision Deficiency (CVD), often without being aware of it. To address this, this study focuses on the design of KULAY, a head-mounted display (HMD) device that can assess whether a user has CVD through the standard Hardy-Rand-Rittler (HRR) test, which evaluates the user via pattern recognition. In addition, this research also covers CVD simulation and color correction through color transformation, enabling people with normal color vision to see how color-vision-deficient users perceive, and vice versa. The accuracy of the simulated HRR assessment was validated against an actual assessment performed by a doctor. For the fidelity of the color transformation, the Structural Similarity Index Method (SSIM) was used to compare the simulated CVD images and the color-corrected images with other reference sources. The simulated HRR assessment and color transformation show very promising results, indicating the effectiveness and efficiency of the study. Given its form factor and portability, the device is beneficial in the fields of medicine and technology.
Mango production is highly vital in the Philippines; mangoes are essential in the food industry, being used in markets and restaurants daily. The quality of mangoes affects a mango farmer's income, so harvesting at the wrong time results in the loss of quality mangoes and income. Scientific farming, supported by new devices, is much needed nowadays, because mango wastage increases annually due to poor quality. This paper focuses on profiling and sorting Mangifera indica using image processing techniques and pattern recognition. The image of a mango is captured weekly from its early stage, and the researchers monitor its growth and color transition for profiling purposes. The actual dimensions of the mango are determined through image conversion and the pixel and RGB values computed in MATLAB. A program is developed to determine the size range of a standard ripe mango. Hue, saturation, lightness (HSL) correction is used in the filtering process to ensure the accuracy of the RGB values of the mango subject. Using pattern recognition, the program can determine whether a mango meets the standard and is ready to be exported.
A constructive solid geometry (CSG) tree model is proposed to progressively abstract the 3D geometric shape of a general object from a 2D image. Unlike conventional approaches, our method applies to general objects without the need for massive CAD model collections, and represents object shapes in a coarse-to-fine manner that allows users to view intermediate shape representations at any time. Standing in a transitional position between 2D image features and CAD models, it benefits from state-of-the-art object detection approaches, better initializes a CAD model for finer fitting, and estimates the 3D shape and pose parameters of the object at different levels according to a visual perception objective, in a coarse-to-fine manner. The two main contributions are the application of the CSG building-up procedure to visual perception, and the extension of the object estimation result into a model more flexible and expressive than 2D/3D primitive shapes. Experimental results demonstrate the feasibility and effectiveness of the proposed approach.
Lane marking detection is a very important part of ADAS for avoiding traffic accidents. To obtain accurate lane markings, a novel and efficient algorithm is proposed in this work that analyzes the waveform generated from the road image after inverse perspective mapping (IPM). The algorithm has two main stages: the first preprocesses the image, including a CNN, to suppress the background and enhance the lane markings; the second computes the waveform of the road image and analyzes it to extract the lanes. The contribution of this work is the use of local and global features of the waveform to detect the lane markings. The results indicate the proposed method is robust in detecting and fitting lane markings.
The QR (Quick Response) code is a kind of two-dimensional barcode first developed in the automotive industry. Nowadays, QR codes are widely used in commercial applications such as product promotion, mobile payment and product information management. Traditional QR codes conforming to the international standard are reliable and fast to decode, but lack the aesthetic appearance needed to present visual information to customers. In this work, we present a novel interactive method for generating aesthetic QR codes. Given the information to be encoded and an image to be used as the full QR code background, our method accepts interactive user strokes as hints to remove undesired QR code modules, relying on the QR code error correction mechanism and background color thresholds. Compared with previous approaches, our method follows the intention of the QR code designer and thus achieves more pleasing results while keeping high machine readability.
Online educational resources such as MOOCs are becoming increasingly popular, especially in higher education. One of the most important media types for MOOCs is the course video. Beyond the traditional bottom-position subtitles that accompany videos, researchers have in recent years developed more advanced algorithms to generate speaker-following subtitles; however, the effectiveness of such subtitles has remained unclear. In this paper, we investigate the relationship between subtitle position and the learning effect after watching a video on a tablet device. Using image-based eye tracking, this work combines objective gaze estimation statistics with a subjective user study to reach a convincing conclusion: speaker-following subtitles are more suitable for online educational videos.
This paper proposes a single-image super-resolution algorithm based on image patch classification and sparse representation, in which gradient information is used to classify image patches into three classes so as to reflect the differences between patch types. Compared with other classification algorithms, the gradient-based algorithm is simpler and more effective. Each class is learned to obtain a corresponding sub-dictionary, and a high-resolution image patch is reconstructed from the dictionary and the sparse representation coefficients of the corresponding patch class. The experimental results demonstrate that the proposed algorithm outperforms the compared algorithms.
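The routing step above, assigning each patch to one of three classes by its gradient statistics, can be sketched as below. The class names and thresholds are hypothetical; the paper only states that gradient information splits patches into three classes, each with its own sub-dictionary:

```python
def classify_patch(patch, t_flat=2.0, t_edge=8.0):
    """Classify a patch as 'smooth', 'texture' or 'edge' by mean gradient.

    patch: 2D list of intensities. Forward differences approximate the
    gradient; the mean magnitude is compared against two thresholds.
    """
    h, w = len(patch), len(patch[0])
    grads = []
    for y in range(h):
        for x in range(w):
            gx = patch[y][x + 1] - patch[y][x] if x + 1 < w else 0
            gy = patch[y + 1][x] - patch[y][x] if y + 1 < h else 0
            grads.append((gx * gx + gy * gy) ** 0.5)
    mean_grad = sum(grads) / len(grads)
    if mean_grad < t_flat:
        return "smooth"
    if mean_grad < t_edge:
        return "texture"
    return "edge"
```

At reconstruction time, the label selects which learned sub-dictionary supplies the sparse coding for that patch.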
In this work, we studied a strategy for training a convolutional neural network for pedestrian gender classification with a limited amount of labeled training data. Unsupervised learning by k-means clustering on pedestrian images was used to learn the filters that initialize the first layer of the network. As a form of pre-training, supervised learning was performed on the related task of pedestrian classification. Finally, the network was fine-tuned for gender classification. We found that this strategy improved the network's generalization ability in gender classification, achieving better test results than random weight initialization, and proved slightly more beneficial than merely initializing the first-layer filters by unsupervised learning. This shows that unsupervised learning followed by pre-training on pedestrian images is an effective strategy for learning useful features for pedestrian gender classification.
Due to improvements in picture quality, demand for closed-circuit television (CCTV) has grown rapidly, and its market size has grown with it. In the current system structure, CCTV cameras transfer compressed images to a control center without any analysis. Such images are suitable as evidence for a criminal arrest, but they cannot prevent crime in real time, which has been considered a limitation. The present paper therefore proposes a system that can prevent crimes efficiently by adding situation awareness at the back end of the CCTV cameras used for image acquisition. In the implemented system, a region of interest (ROI) is defined virtually within the image data for sites where a physical barrier, such as a fence, cannot be installed; unauthorized intruders are continuously tracked through data analysis and recognized within the ROI by the developed algorithm. A searchlight or alarm is then activated to deter the crime in real time, and the urgent information is transferred to the control center. The system was implemented on a Raspberry Pi 2 board and runs in real time. Experimental results showed a recognition success rate of 85% or higher and a tracking accuracy of 90% or higher. The system can thus contribute to crime prevention as part of a social safety network.
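The virtual-ROI check at the core of such a system can be sketched as follows; the rectangular ROI and the list of tracked centroids are simplifications of the implemented pipeline:

```python
def in_roi(point, roi):
    """Return True if a tracked centroid (x, y) lies inside a
    virtual rectangular ROI given as (x_min, y_min, x_max, y_max)."""
    x, y = point
    x0, y0, x1, y1 = roi
    return x0 <= x <= x1 and y0 <= y <= y1

def check_intruders(centroids, roi):
    """Flag which tracked objects are inside the ROI. In the full
    system, a positive hit would trigger the searchlight/alarm and
    notify the control center."""
    return [in_roi(c, roi) for c in centroids]
```

A polygonal ROI and per-object track histories would replace the rectangle and centroid list in a deployed system.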
Error concealment (EC) is a decoder-side technique for hiding transmission errors by exploiting spatial or temporal information from the available video frames. Recovering distorted video is important because video is used in many applications, such as video telephony, video conferencing, TV, DVD, Internet video streaming, and video games. Retransmission-based and resilience-based methods can also remove errors, but they add delay and redundant data, so error concealment is the preferred option for error hiding. In this paper, a block-matching error concealment algorithm is compared with a frequency-selective extrapolation algorithm. Both are evaluated on video frames with manually introduced errors. Objective quality was measured with PSNR (peak signal-to-noise ratio) and SSIM (structural similarity index), comparing the original frames against the concealed frames produced by each algorithm. According to the simulation results, frequency-selective extrapolation achieves better quality measures than block matching: 48% higher PSNR and 94% higher SSIM.
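PSNR, one of the two quality measures used, can be computed directly from the mean squared error between a reference frame and a concealed frame:

```python
import math

def psnr(ref, test, max_val=255.0):
    """Peak signal-to-noise ratio between two equal-size frames
    given as flat sequences of pixel values. Higher means closer
    to the reference; identical frames give infinity."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)
```

SSIM, the second measure, additionally accounts for local luminance, contrast, and structure rather than raw pixel error.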
Despite previous studies, tagging and indexing product images remain challenging due to the large inner-class variation of products. In traditional methods, quantized hand-crafted features such as SIFT are extracted as the representation of the product images, but these are not discriminative enough to handle the inner-class variation. For a discriminative image representation, this paper first presents a novel deep convolutional neural network (DCNN) architecture pre-trained on a large-scale general image dataset. Compared to traditional features, our DCNN representation is more discriminative with fewer dimensions. Moreover, we incorporate a part-based model into the framework to overcome the negative effects of poor alignment and cluttered backgrounds, further enhancing the descriptive power of the deep representation. Finally, we collect and contribute a well-labeled shoe image database, TBShoes, on which we apply the part-based deep representation for product image tagging and search, respectively. The experimental results highlight the advantages of the proposed part-based deep representation.
We developed the Bayesian AutoEncoder (BAE) to construct a multi-layer restricted Bayesian network by extracting features from a training dataset. Networks constructed with BAE have hidden variables that represent features of the data and can perform inference over each feature. In this paper, we show that a network constructed by BAE can not only recognize features but also fill in missing data, and we confirm this filling-in ability experimentally.
An iterative reweighted least squares (IRLS) algorithm is presented in this paper for the minimax design of FIR filters. In the algorithm, the subproblems generated by weighted least squares (WLS) are solved using the conjugate gradient (CG) method instead of time-consuming matrix inversion, yielding an almost-minimax filter. The solution is very efficient compared with most existing algorithms and is flexible enough to extend to a broad range of filter designs, including constrained filters. Two design examples and comparisons with existing algorithms show the excellent performance of the proposed approach.
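A minimal sketch of the IRLS idea, using Lawson-style residual reweighting on a toy linear approximation problem; a plain least-squares solver stands in here for the paper's CG-based WLS step:

```python
import numpy as np

def minimax_fit(A, b, iters=50):
    """Lawson-style IRLS for an approximate minimax solution of
    A x ~ b: repeatedly solve a weighted least-squares problem,
    re-weighting each equation by its residual magnitude so large
    residuals get pushed down. A simplified stand-in for the
    paper's CG-based WLS solver."""
    m = len(b)
    w = np.full(m, 1.0 / m)                 # uniform initial weights
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        W = np.sqrt(w)[:, None]
        x, *_ = np.linalg.lstsq(W * A, np.sqrt(w) * b, rcond=None)
        r = np.abs(b - A @ x)               # per-equation residuals
        w *= r                              # Lawson's multiplicative update
        w /= w.sum() + 1e-15                # renormalize
    return x
```

For the three points (0,0), (1,1), (2,0) fitted by a line, the minimax solution is the constant 0.5 with equioscillating residuals of magnitude 0.5, which the iteration reaches quickly.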
Real-time eye tracking has numerous applications in human-computer interaction, such as controlling the mouse cursor in a computer system, and is useful for persons with muscular or motion impairments. However, tracking the movement of the eye is complicated by occlusion due to blinking, head movement, screen glare, rapid eye movements, and so on. In this work, we present the algorithmic and construction details of a real-time eye tracking system. Our proposed system extends spatio-temporal context learning with Kalman filtering. Spatio-temporal context learning offers state-of-the-art accuracy in general object tracking, but its performance suffers under occlusion. Adding the Kalman filter allows the proposed method to model the dynamics of eye motion and provide robust eye tracking through occlusions. We demonstrate the effectiveness of this tracking technique by controlling the computer cursor in real time with eye movements.
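A minimal constant-velocity Kalman filter of the kind that could back such a tracker is sketched below; occlusion (e.g. a blink) is modeled by missing measurements, during which the filter predicts only. The noise settings are illustrative assumptions:

```python
import numpy as np

def kalman_track(measurements, dt=1.0, q=1e-2, r=1.0):
    """Constant-velocity Kalman filter over 1-D positions.

    `measurements` is a list of positions; None entries model
    occlusion, during which only the prediction step runs.
    Returns the filtered positions. q (process noise) and r
    (measurement noise) are illustrative.
    """
    F = np.array([[1.0, dt], [0.0, 1.0]])   # constant-velocity transition
    H = np.array([[1.0, 0.0]])              # we observe position only
    Q = q * np.eye(2)
    R = np.array([[r]])
    x = np.array([[measurements[0]], [0.0]])
    P = np.eye(2)
    out = []
    for z in measurements:
        x = F @ x                           # predict
        P = F @ P @ F.T + Q
        if z is not None:                   # update only when the eye is visible
            y = np.array([[z]]) - H @ x
            S = H @ P @ H.T + R
            K = P @ H.T @ np.linalg.inv(S)
            x = x + K @ y
            P = (np.eye(2) - K @ H) @ P
        out.append(float(x[0, 0]))
    return out
```

In 2-D, the same structure is applied with a four-component state (x, y and their velocities).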
In recent years, unmanned aerial vehicles (UAVs) have increasingly been used in civil applications, but they also pose a significant threat in restricted zones. Radar can be used to detect and discriminate UAVs; however, because UAVs fly at low altitude, the radar returns also include unwanted echoes reflected by buildings, the ground, trees, and grass, so clean UAV characteristics for classification cannot be obtained directly. In this paper, an MTI filter is applied to cancel the ground clutter, and on this basis an improved MTI filter is proposed. Compared with the traditional MTI filter, the improved one significantly enhances ground clutter rejection while retaining most of the target power, so cleaner UAV classification features can be obtained. The effectiveness of the proposed method is verified on an experimental CW radar dataset collected from a helicopter UAV.
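The traditional MTI baseline that the paper improves upon can be illustrated with a two-pulse canceller, which subtracts consecutive pulse returns so that stationary clutter cancels:

```python
def mti_filter(pulses):
    """Two-pulse MTI canceller: y[n] = x[n] - x[n-1] across
    consecutive pulse returns. Stationary clutter, which is
    identical from pulse to pulse, cancels; moving-target
    returns, which change pulse to pulse, pass through."""
    return [b - a for a, b in zip(pulses, pulses[1:])]
```

A return of 5 (clutter) plus an alternating target component illustrates the effect: the constant part vanishes while the target component survives.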
Numerous fuzzy pattern mining methods have been proposed to address the uncertainty and incompleteness of quantitative data. Traditional fuzzy pattern mining methods generally must first transform the original quantitative values into either crisp items or fuzzy regions, which is hard to do without comprehensive domain knowledge. In addition, existing numerical pattern mining methods generally suffer from high computational cost. Motivated by these problems, we propose an efficient maximal approximate numerical frequent pattern mining (MANFPM) method that requires no fuzzy item or region specification. Experimental results validate its scalability and effectiveness for emitter entity resolution.
Recent research suggests that meditation affects the structure and function of the brain and that meditators handle cognitive load more effectively. EEG signals can be used to quantify cognitive load, and the effect of meditation on cognitive workload, measured by EEG before and after a meditation regimen, remains an open problem. The subjects in this study were 11 young, healthy engineering students from our institute practicing focused-attention meditation. EEG signals were recorded with an EMOTIV device at the start of the regimen and again after four weeks of daily 20-minute meditation practice. Seven levels of arithmetic addition, from single-digit (low load) to three-digit with carry (high load), were presented as the cognitive load. Cognitive load indices such as arousal index, performance enhancement, neural activity, load index, engagement, and alertness were evaluated before and after the regimen, and all improved in the post-meditation data. The power spectral density (PSD) feature was also compared between pre- and post-meditation recordings across all subjects. The results suggest that after four weeks of meditation the subjects handled the same cognitive load with less stress, i.e., with greater ease of cognitive functioning.
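One common way to compute an engagement-style index from EEG band powers is sketched below; the band definitions and the beta/(alpha+theta) form are standard conventions in the literature, assumed here rather than taken from the paper:

```python
import numpy as np

def band_power(signal, fs, lo, hi):
    """Average spectral power of `signal` in the band [lo, hi) Hz,
    computed via the real-input FFT."""
    spec = np.abs(np.fft.rfft(signal)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = (freqs >= lo) & (freqs < hi)
    return spec[mask].mean()

def engagement_index(signal, fs):
    """Engagement index beta / (alpha + theta), one plausible form
    of the 'engagement' measure the abstract mentions."""
    theta = band_power(signal, fs, 4, 8)
    alpha = band_power(signal, fs, 8, 13)
    beta = band_power(signal, fs, 13, 30)
    return beta / (alpha + theta + 1e-12)
```

A dominant alpha rhythm (e.g. a relaxed, eyes-closed state) drives the index down, while beta dominance drives it up.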
This research aims to build a tool for assessing patients for post-traumatic stress disorder (PTSD). The parameters used are heart rate, skin conductivity, and facial gestures. Facial gestures are recorded using OpenFace, an open-source face analysis program that uses facial action units to track facial movements. Heart rate and skin conductivity are measured with sensors operated by a Raspberry Pi. Results are stored in a database for quick access, and the database is uploaded to a cloud platform so that doctors have direct access to the data. The goal is to analyze these parameters and give an accurate assessment of the patient.
Due to their ability to capture the kinematic properties of a target object, radar micro-Doppler signatures (m-DS) play an important role in radar target classification, as is evident from the remarkable number of papers published on m-DS every year for various applications. However, most of these works rely on the support vector machine (SVM) for target classification, and training an SVM is computationally expensive because of its search for the support vectors. In this paper, the classifier learning problem is instead addressed by total error rate (TER) minimization, for which an analytic solution is available; this largely reduces the search time in the learning phase. The analytically obtained TER solution is globally optimal with respect to the classification total error count rate. Moreover, our empirical results show that TER outperforms SVM in both classification accuracy and computational efficiency on a five-category radar classification problem.
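The appeal of an analytic solution can be illustrated with a closed-form regularized least-squares classifier; this is a generic stand-in for, not the exact formulation of, TER minimization:

```python
import numpy as np

def fit_ls_classifier(X, y, lam=1e-3):
    """Closed-form (regularized least-squares) linear classifier,
    illustrating why an analytic solution avoids the iterative
    search needed to train an SVM. Not the paper's TER solution.

    X: (n, d) features; y: (n,) labels in {-1, +1}.
    """
    Xb = np.hstack([X, np.ones((len(X), 1))])      # append bias column
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])      # regularized normal matrix
    return np.linalg.solve(A, Xb.T @ y)            # one linear solve, no search

def predict(w, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return np.sign(Xb @ w)
```

The entire training cost is a single d+1 dimensional linear solve, in contrast to the support-vector search of SVM training.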
Strong jumping emerging patterns (SJEPs) are data mining patterns with strong discriminating ability for classification. However, recent SJEP mining algorithms are usually built on tree data structures, and such tree-based algorithms struggle to achieve excellent performance. In this paper, we propose a novel SJEP mining method named PPSJEP. It is based on a new data structure called the NSJEP-list, an improvement of the N-list that replaces the tree structure. First, we obtain the NSJEP-lists of individual items from the tree. Then we intersect NSJEP-lists to obtain the NSJEP-lists of longer itemsets, which record each itemset's positions and per-class counts, and we mine the SJEPs from this information. Experiments on six UCI datasets comparing running time and classification accuracy show that our algorithm mines SJEPs in less time while achieving the same classification accuracy, especially at lower minimum support thresholds.
The recently proposed Indexing-First-One (IFO) hashing is a technique adopted particularly for iris template protection, i.e., for IrisCode. However, IFO employs the Jaccard similarity (JS) measure originating from Min-hashing, which has not yet been adequately discussed. In this paper, we explore the nature of JS in the binary domain and propose a mathematical formulation that generalizes the usage of JS, which we verify on the CASIA v3-Interval iris database. Our study reveals that the JS applied in IFO hashing is a generalized version of the Min-hashing measure of two input objects, in which the JS coefficient equals one. With this understanding, IFO hashing inherits the useful properties of Min-hashing, notably similarity preservation, making it favorable for similarity search and recognition in binary space.
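The Jaccard similarity at the heart of the discussion is, for binary vectors, the ratio of the intersection to the union of the set bits:

```python
def jaccard(a, b):
    """Jaccard similarity |A n B| / |A u B| of two binary vectors
    given as equal-length 0/1 sequences. Returns 1.0 for two
    all-zero vectors by convention."""
    inter = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    union = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return inter / union if union else 1.0
```

Min-hashing estimates exactly this quantity: the probability that two Min-hash values collide equals the Jaccard similarity of the underlying sets, which is the similarity-preservation property the abstract refers to.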
Video sensor data has been widely used in automatic surveillance applications. In this study, we present a method that automatically detects pigs in a pig room by using depth information obtained from a Kinect sensor. For a real-time implementation, we propose a means of reducing the execution time by applying parallel processing techniques. In general, most parallel processing techniques have been used to parallelize a specific task. In this study, we consider parallelization of an entire system that consists of several tasks. By applying a scheduling strategy to identify a computing device for each task and implementing it with OpenCL, we can reduce the total execution time efficiently. Experimental results reveal that the proposed method can automatically detect pigs using a CPU-GPU hybrid system in real time, regardless of the relative performance between the CPU and GPU.
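The depth-based foreground step such a system might use can be sketched as a simple height threshold against the floor plane; the threshold and geometry are illustrative assumptions, and the paper's actual contribution (CPU-GPU task scheduling with OpenCL) is not reproduced here:

```python
import numpy as np

def detect_foreground(depth, floor_depth, height_thresh=150.0):
    """Segment foreground objects (e.g. pigs) from a depth image
    taken by a downward-facing sensor: pixels at least
    `height_thresh` depth units closer to the sensor than the
    floor plane are marked foreground. Threshold is illustrative.
    Returns a boolean mask."""
    return (floor_depth - depth) >= height_thresh
```

In the full pipeline, this mask would feed subsequent tasks (labeling, tracking), each of which the scheduler assigns to the CPU or GPU.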
Deep learning is currently a very hot topic in pattern recognition and artificial intelligence research. Addressing the practical problem that people often do not know which category a piece of rubbish belongs to, and building on the strong image classification ability of deep learning, we designed a prototype system that helps users classify rubbish. First, the CaffeNet model was trained for our classification network on the ImageNet dataset, and the trained network was deployed on a web server. Second, an Android app was developed that lets users capture images of unclassified rubbish, upload them to the web server for analysis, and retrieve the feedback, so that users can conveniently obtain classification guidance on an Android device. Tests on our prototype show that an image of a single type of rubbish in its original shape is classified reliably, while an image containing several kinds of rubbish, or rubbish with a changed shape, may fail to yield a useful classification. Nevertheless, the system shows promise as an aid for rubbish classification if the network training strategy is further optimized.
Curriculum learning is a technique in which a classifier learns from easy samples first and then from increasingly difficult samples. Along similar lines, a curriculum-based feature selection framework is proposed for identifying the most useful features in a dataset. Given a dataset, easy and difficult samples are first identified; in general, the number of easy samples is assumed to be larger than the number of difficult ones. Feature selection is then done in two stages: a fast feature selection method that produces feature scores is applied to the easy samples, and the scores are then updated incrementally with the set of difficult samples. Existing feature selection methods are not incremental in nature; the entire dataset must be used at once. Curriculum learning is expected to decrease the time needed for feature selection while keeping classification accuracy comparable to existing methods, and it also allows incremental refinement of the selection as new training samples become available. Our experiments on a number of standard datasets demonstrate that feature selection is indeed faster without sacrificing classification accuracy.
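The incremental score update can be sketched with running class means, a simple stand-in for the paper's fast first-stage scoring method; scoring the easy samples first and updating with the difficult ones gives the same result as scoring everything in one batch:

```python
import numpy as np

class IncrementalScore:
    """Feature scores as the absolute difference of running class
    means, updatable as new (e.g. 'difficult') samples arrive.
    The scoring function itself is a simple stand-in for the
    paper's fast first-stage selector."""

    def __init__(self, d):
        self.sum = {1: np.zeros(d), -1: np.zeros(d)}  # per-class feature sums
        self.n = {1: 0, -1: 0}                        # per-class sample counts

    def update(self, X, y):
        """Fold a new batch (X: (n, d), y in {-1, +1}) into the stats."""
        for label in (1, -1):
            rows = X[y == label]
            self.sum[label] += rows.sum(axis=0)
            self.n[label] += len(rows)

    def scores(self):
        """Current per-feature score: |mean(class +1) - mean(class -1)|."""
        m1 = self.sum[1] / max(self.n[1], 1)
        m0 = self.sum[-1] / max(self.n[-1], 1)
        return np.abs(m1 - m0)
```

Because only sums and counts are stored, the difficult samples refine the scores without revisiting the easy ones.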