This PDF file contains the front matter associated with SPIE Proceedings Volume 10025, including the Title Page, Copyright information, Table of Contents, Introduction (if any), and Conference Committee listing.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks. You are receiving this notice because your organization may not have SPIE eBooks access. Shibboleth/OpenAthens users: please sign in to access your institution's subscriptions. To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
Batik is one of Indonesia's traditional cloths. Each motif, or pattern, drawn on a piece of batik fabric has a specific name and philosophy. Although batik cloths are widely used in everyday life, only a few people understand their motifs and philosophy. This research develops a batik motif recognition system that identifies the motif of a batik image automatically. First, a batik image is decomposed into sub-images using the wavelet transform. Six texture descriptors, i.e. max probability, correlation, contrast, uniformity, homogeneity, and entropy, are extracted from the gray-level co-occurrence matrix of each sub-image. The texture features are then matched to template features using the Canberra distance. The experiment is performed on a batik dataset consisting of 1088 batik images grouped into seven motifs. The best recognition rate, 92.1%, is achieved using a feature extraction process with 5-level wavelet decomposition and a 4-directional gray-level co-occurrence matrix.
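The six GLCM descriptors and the Canberra matching step can be sketched as follows. This is a minimal NumPy illustration on a toy 4-level image standing in for one wavelet sub-band; the function names and the single-offset GLCM are our own simplifications, not the paper's code.

```python
import numpy as np

def glcm(img, dx=1, dy=0, levels=4):
    """Gray-level co-occurrence matrix for one offset, normalized to sum to 1."""
    g = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h - dy):
        for x in range(w - dx):
            g[img[y, x], img[y + dy, x + dx]] += 1
    return g / g.sum()

def texture_descriptors(p):
    """The six descriptors named in the abstract, from a normalized GLCM."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt(((i - mu_i) ** 2 * p).sum())
    sd_j = np.sqrt(((j - mu_j) ** 2 * p).sum())
    return {
        "max_probability": p.max(),
        "correlation": ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j),
        "contrast": ((i - j) ** 2 * p).sum(),
        "uniformity": (p ** 2).sum(),                 # a.k.a. energy / ASM
        "homogeneity": (p / (1 + np.abs(i - j))).sum(),
        "entropy": -(p[p > 0] * np.log2(p[p > 0])).sum(),
    }

def canberra(a, b):
    """Canberra distance between two feature vectors (0/0 terms count as 0)."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    d = np.abs(a) + np.abs(b)
    return np.sum(np.where(d > 0, np.abs(a - b) / np.where(d > 0, d, 1), 0.0))

# Toy 4-level image standing in for one wavelet sub-band
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [2, 2, 3, 3],
                [2, 2, 3, 3]])
feats = texture_descriptors(glcm(img))
```

In the full system, one such descriptor vector per sub-image would be concatenated and compared to each motif template with `canberra`.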
Template matching is a basic algorithm in image processing, and real-time performance is a crucial requirement of object tracking. For real-time tracking, a fast template matching algorithm based on grey prediction is presented, in which computation cost is reduced dramatically by minimizing the search range. First, the location of the tracked object in the current image is estimated with a Grey Model (GM). GM(1,1), the basic model of grey prediction, can use a small amount of known information to foretell the location. Second, the precise position of the object in the frame is computed by template matching. Here, a Sequential Similarity Detection Algorithm (SSDA) with a self-adaptive threshold is employed to obtain the matching position in the neighborhood of the predicted location. The threshold plays an important role in SSDA, as a proper threshold makes template matching fast and accurate. Moreover, a practical weighted strategy is utilized to handle scale and rotation changes of the object, as well as illumination changes. The experimental results show the superior performance of the proposed algorithm over the conventional full-search method, especially in terms of execution time.
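The GM(1,1) prediction step can be sketched in a few lines. This is a generic textbook GM(1,1) forecaster, not the paper's implementation; the sample coordinate series is invented for illustration.

```python
import numpy as np

def gm11_predict(x0, steps=1):
    """GM(1,1) grey prediction: forecast the next values of a short positive series.

    x0: observed sequence (e.g. recent x-coordinates of the tracked object).
    Returns the `steps` forecast values.
    """
    x0 = np.asarray(x0, float)
    n = len(x0)
    x1 = np.cumsum(x0)                       # accumulated generating operation
    z1 = 0.5 * (x1[1:] + x1[:-1])            # mean generating sequence
    B = np.column_stack([-z1, np.ones(n - 1)])
    a, b = np.linalg.lstsq(B, x0[1:], rcond=None)[0]   # grey parameters
    c = x0[0] - b / a
    x1_hat = lambda k: c * np.exp(-a * k) + b / a      # fitted accumulated series
    return [x1_hat(n + s) - x1_hat(n + s - 1) for s in range(steps)]

# Predict the next location of a target drifting with roughly exponential trend
xs = [10.0, 12.0, 14.5, 17.3, 20.8]
pred = gm11_predict(xs, steps=1)[0]
```

SSDA would then search only a small neighborhood around `pred` instead of the whole frame.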
Object tracking is a challenging research task due to target appearance variation caused by deformation and occlusion. Keypoint-matching-based trackers can handle partial occlusion, but they are vulnerable to matching faults and inflexible to target deformation. In this paper, we propose an innovative keypoint matching procedure to address these issues. First, the scale and orientation of corresponding keypoints are applied to estimate the target's status. Second, a kernel function is employed to discard mismatched keypoints and thus improve the estimation accuracy. Third, a model updating mechanism is applied to adapt to target deformation. Moreover, to avoid bad updates, backward matching is used to decide whether or not to update the target model. Extensive experiments on challenging image sequences show that our method performs favorably against state-of-the-art methods.
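The backward-matching check is essentially a mutual nearest-neighbour test, which can be sketched as follows in plain NumPy with Euclidean descriptor distance. The descriptors here are tiny invented vectors, and the kernel-weighted status estimation is omitted.

```python
import numpy as np

def match(desc_a, desc_b):
    """Nearest-neighbour match from each descriptor in A to B (Euclidean)."""
    d = np.linalg.norm(desc_a[:, None, :] - desc_b[None, :, :], axis=2)
    return np.argmin(d, axis=1)

def backward_consistent_matches(desc_a, desc_b):
    """Keep only matches that survive forward-backward (mutual) verification,
    in the spirit of the backward-matching check before a model update."""
    fwd = match(desc_a, desc_b)          # A -> B
    bwd = match(desc_b, desc_a)          # B -> A
    return [(i, j) for i, j in enumerate(fwd) if bwd[j] == i]

# Tiny synthetic example: rows 0 and 1 of A correspond to rows of B;
# row 2 of A has no true correspondence and should be rejected.
A = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
B = np.array([[0.1, 0.0], [1.0, 0.9], [0.05, 0.05]])
good = backward_consistent_matches(A, B)
```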
This paper describes the use of image skeletonization to estimate all the navigable points within a mobile robot navigation scene. Those points are used to compute a valid navigation path with standard methods. The main idea is to find the middle and extreme points of the obstacles in the scene, taking the robot size into account, and create a map of navigable points, in order to reduce the amount of information passed to the planning algorithm. Those points are located by means of the skeletonization of a binary image of the obstacles and the scene background, along with other digital image processing algorithms. The proposed algorithm automatically yields a variable number of navigable points per obstacle, depending on the complexity of its shape. We also show how some of the algorithm's parameters can be changed to alter the final number of resulting key points. The results shown here were obtained by applying different kinds of digital image processing algorithms on static scenes.
Many data mining applications adopt Artificial Neural Networks (ANNs), yet training an ANN involves many difficulties: the number of labeled samples required, training time and performance, and the choice of hidden layers and transfer functions. When the compared results do not match expectations, it cannot be known clearly which dimension caused the deviation. The main reason is that an ANN fits the compared results by modifying weights; it does not improve the underlying image feature extraction algorithm, but tends to reach the correct value by adjusting the weights applied to the result. To address these problems, this paper proposes a method to assist the image data analysis of an ANN. Normally, a parameter is set as the value controlling feature extraction from an image; we treat this value as a weight. The experiments use the values extracted from Speeded Up Robust Features (SURF) keypoints as the training basis, since SURF can extract different feature points depending on the extraction value. We perform an initial semi-supervised clustering according to these values and use a Modified Fuzzy K-Nearest Neighbors (MFKNN) scheme for training and classification. Unknown images are not matched by one-to-one exhaustive comparison but only against the group centroids, the main purpose being efficiency and speed; the retrieval results are then observed and analyzed. In short, the method clusters and classifies using the nature of image feature points, assigns values to groups with high error rates to produce new feature points, and feeds them into the input layer of the ANN for training. Finally, a comparative analysis is made against a Back-Propagation Neural Network (BPN) and a Genetic Algorithm-Artificial Neural Network (GAANN) through the weight-training results.
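The centroid-only matching idea, i.e. comparing an unknown sample against group centroids instead of every stored image, can be sketched as follows. This is a toy NumPy illustration with invented 2-D features; the SURF extraction and the MFKNN details are omitted.

```python
import numpy as np

def group_centroids(features, labels):
    """Centroid of each cluster of feature vectors (labels come from the
    initial semi-supervised clustering step)."""
    labs = np.unique(labels)
    return labs, np.array([features[labels == l].mean(axis=0) for l in labs])

def classify_by_centroid(x, labs, cents):
    """Match an unknown sample only against the group centroids, not
    one-to-one against every stored image -- the speed-up described above."""
    return labs[np.argmin(np.linalg.norm(cents - x, axis=1))]

rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0, 0.1, (20, 2)),      # cluster 0 around (0, 0)
                   rng.normal(3, 0.1, (20, 2))])     # cluster 1 around (3, 3)
labels = np.array([0] * 20 + [1] * 20)
labs, cents = group_centroids(feats, labels)
pred = classify_by_centroid(np.array([2.9, 3.1]), labs, cents)
```

The query is compared to 2 centroids instead of 40 stored vectors; the saving grows with the gallery size.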
This paper proposes a method for removing mismatched lines in multispectral images. The inaccurate detection of end points poses a great challenge for line matching, since corresponding lines may not be extracted in their entirety; as a result, lines are often mismatched by the line descriptor. To eliminate the mismatched lines, we employ a modified RANSAC (Random Sample Consensus) consisting of two steps: (1) pick three line matches at random and determine their intersections, which are used to calculate a transformation; (2) obtain the best transformation by sorting the matching scores of the line matches, and declare the inliers as the correct matches. Experimental results show that the proposed method can effectively remove incorrect matches in multispectral images.
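Step (1) can be illustrated as follows: the homogeneous-coordinate intersection of two lines, and a transformation estimated from three point correspondences. For simplicity this sketch fits an affine transform rather than the full projective model, and all names and sample lines are our own.

```python
import numpy as np

def line_intersection(l1, l2):
    """Intersection of two lines given as (a, b, c) with ax + by + c = 0."""
    cross = np.cross(l1, l2)            # homogeneous intersection point
    return cross[:2] / cross[2]

def affine_from_3_points(src, dst):
    """Affine transform mapping three source points to three destination
    points (the intersections of a random triple of line matches supply
    the point correspondences)."""
    A = np.hstack([np.asarray(src), np.ones((3, 1))])   # 3x3 system
    return np.linalg.solve(A, np.asarray(dst))          # 3x2 coefficients

def apply_affine(coef, pts):
    return np.hstack([pts, np.ones((len(pts), 1))]) @ coef

src = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
dst = np.array([[2.0, 1.0], [3.0, 1.0], [2.0, 2.0]])    # pure translation (+2, +1)
T = affine_from_3_points(src, dst)
mapped = apply_affine(T, np.array([[5.0, 5.0]]))
```

A RANSAC loop would repeat this with random triples and keep the transformation with the best inlier score.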
Solar magnetic structures exhibit a wealth of different spatial and temporal scales. At present, the solar magnetic element is believed to be the ultra-fine magnetic structure of the lower solar atmosphere, and the diffraction limit of China's largest-aperture solar telescope (the New Vacuum Solar Telescope, NVST) is close to the spatial scale of the magnetic element. This implies that modern solar observations have entered an era of high resolution, better than 0.2 arc-seconds. Since 2011, the NVST has been successfully commissioned and has obtained a huge amount of observational data. Moreover, the ultra-fine magnetic structure rooted in the dark intergranular lanes can be easily resolved. Studying the observational characteristics and physical mechanism of magnetic bright points is one of the most important topics in solar physics, so it is very important to determine the statistical and physical parameters of magnetic bright points with feature extraction techniques and numerical analysis approaches. To identify such ultra-fine magnetic structures, an automatic and effective detection algorithm employing the Laplacian transform and the morphological dilation technique is proposed and examined. Statistical parameters such as the typical diameter, the area distribution, the eccentricity, and the intensity contrast are then obtained. Finally, the scientific value of investigating the physical parameters of magnetic bright points is discussed, especially for understanding the physical processes by which solar magnetic energy is transferred from the photosphere to the corona.
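A minimal version of the detection idea, a Laplacian response followed by thresholding and morphological dilation, might look like this in NumPy. The seeding threshold (mean minus k standard deviations of the Laplacian) is an assumption for illustration, not the paper's calibrated criterion.

```python
import numpy as np

def laplacian(img):
    """4-neighbour Laplacian via shifted copies (edge-replicated borders)."""
    p = np.pad(img, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4 * p[1:-1, 1:-1])

def dilate(mask, it=1):
    """Binary dilation with a 3x3 square element, via shifted logical ORs."""
    for _ in range(it):
        p = np.pad(mask, 1)
        out = np.zeros_like(mask)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out = out | p[1 + dy:p.shape[0] - 1 + dy,
                              1 + dx:p.shape[1] - 1 + dx]
        mask = out
    return mask

def detect_bright_points(img, k=1.0):
    """Seed pixels where the Laplacian response is strongly negative
    (local brightness maxima), then grow the seeds by dilation."""
    lap = laplacian(img)
    seeds = lap < lap.mean() - k * lap.std()
    return dilate(seeds, it=1)

img = np.zeros((9, 9))
img[4, 4] = 1.0            # one bright point on a dark background
mask = detect_bright_points(img)
```

Diameter, area, eccentricity, and contrast statistics would then be measured on the connected components of `mask`.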
It is highly important for visually impaired people (VIP) to be aware of the human beings around them, so correctly recognizing people in a VIP-assisting apparatus provides great convenience. However, in classical face recognition technology, the faces used in training and prediction are usually frontal, and acquiring face images requires subjects to get close to the camera so that a frontal pose and adequate illumination are guaranteed. Meanwhile, face labels are defined manually rather than automatically; most of the time, labels belonging to different classes need to be input one by one. These constraints prevent practical assisting applications for VIP. In this article, a face recognition system for unconstrained environments is proposed. Specifically, it requires neither the frontal pose nor the uniform illumination required by previous algorithms. The contributions of this work lie in three aspects. First, real-time frontal-face synthesis is implemented, and the synthesized frontal faces help to increase the recognition rate, as proved by the experimental results. Second, an RGB-D camera plays a significant role in our system: both color and depth information are utilized to achieve real-time face tracking, which not only raises the detection rate but also makes it possible to label faces automatically. Finally, we propose to train the face recognition system with neural networks, applying Principal Component Analysis (PCA) to pre-refine the input data. This system is expected to help VIP conveniently get familiar with others and, once sufficiently trained, allow them to recognize people.
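The PCA pre-refinement step can be sketched via the SVD: the standard projection of flattened face vectors onto the leading principal components before the network sees them. The random stand-in "faces" below are invented; this is not the authors' pipeline.

```python
import numpy as np

def pca_fit(X, n_components):
    """PCA via SVD on mean-centred data: returns (mean, components)."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:n_components]

def pca_transform(X, mu, comps):
    """Project data onto the leading principal components -- the
    pre-refinement applied to face vectors before classification."""
    return (X - mu) @ comps.T

rng = np.random.default_rng(1)
faces = rng.normal(size=(50, 64))         # 50 flattened stand-in "face" vectors
mu, comps = pca_fit(faces, n_components=8)
reduced = pca_transform(faces, mu, comps)
```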
The replacement of traditional security technology with biometric identification has become a trend, and gait recognition has become a hot research topic because gait is difficult to imitate or steal. This paper presents a gait recognition system based on the integral outline of the human body. The system has three important aspects: the preprocessing of gait images, feature extraction, and classification. Finally, a polling method is used to evaluate the performance of the system, and the open problems in gait recognition and future directions of development are summarized.
This paper presents a novel approach to the recognition of mammograms. The analyzed mammograms represent normal and breast cancer (benign and malignant) cases. The solution applies the deep learning technique to image recognition. To increase classification accuracy, nonnegative matrix factorization and the statistical self-similarity of images are applied. The images reconstructed with these two approaches enrich the database and thereby improve the quality measures of mammogram recognition (increased accuracy, sensitivity, and specificity). The results of numerical experiments performed on the large DDSM database, containing more than 10000 mammograms, have confirmed good accuracy of class recognition, exceeding the best results reported in recent publications for this database.
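Nonnegative matrix factorization itself can be sketched with the classic Lee-Seung multiplicative updates; the reconstructed product W @ H is the kind of image used to enrich a training set. The hyperparameters and the toy rank-2 matrix below are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def nmf(V, rank, iters=500, eps=1e-9, seed=0):
    """Nonnegative matrix factorization V ~ W @ H by Lee-Seung multiplicative
    updates (Frobenius objective). Both factors stay elementwise nonnegative."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H with W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W with H fixed
    return W, H

# A rank-2 nonnegative matrix: NMF should reconstruct it almost exactly.
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]]) @ np.array([[1.0, 2.0, 0.0],
                                       [0.0, 1.0, 3.0]])
W, H = nmf(V, rank=2)
err = np.linalg.norm(V - W @ H)
```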
The descriptor is the key to any image-based recognition algorithm. For ear recognition, conventional descriptors are based on either 2D or 3D data. 2D images provide rich texture information, while the human ear is a 3D surface that offers shape information. This also suggests that 2D data is more robust against occlusion, while 3D data is more robust against illumination and pose variation. In this paper, we introduce a novel Texture and Depth Scale Invariant Feature Transform (TDSIFT) descriptor that encodes 2D and 3D local features for ear recognition. Compared to the original Scale Invariant Feature Transform (SIFT) descriptor, the proposed TDSIFT shows its superiority by fusing 2D and 3D local information. First, keypoints are detected and described on texture images. Then, 3D information from the keypoints' locations on the corresponding depth images is added to form the TDSIFT descriptor. Finally, a local-feature-based classification algorithm is adopted to identify ear samples by TDSIFT. Experimental results on a benchmark dataset demonstrate the feasibility and effectiveness of our proposed descriptor. The rank-1 recognition rate achieved on a gallery of 415 persons is 95.9%, and the computation time is satisfactory compared to state-of-the-art methods.
Fall prevention and detection systems have to overcome many challenges to be efficient. Some of the difficult problems in vision-based systems are obtrusion, occlusion, and overlay; other associated issues are privacy, cost, noise, computational complexity, and the definition of threshold values. Vision-based estimation of human motion usually involves partial overlay, caused by the viewing direction between the camera and objects or body parts, and these issues have to be taken into consideration. This paper proposes a dynamic-threshold and bounding-box posture analysis method with a multiple-Kinect setup for human posture analysis and fall detection. The proposed work uses only two Kinect cameras to acquire distributed values and differentiate between normal activities and falls. If the peak head velocity is greater than the dynamic threshold value, bounding-box posture analysis is used to confirm that a fall has occurred. Furthermore, information captured by multiple Kinects placed at a right angle addresses the skeleton overlay problem of a single Kinect. This work contributes a fusion of multiple Kinect-based skeletons, based on dynamic thresholds and bounding-box posture analysis, which to our knowledge has not been reported before.
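The two-stage rule, a head-velocity peak confirmed by a bounding-box posture test, can be sketched as follows. Here the threshold is fixed rather than dynamic, and the frame rate, units, and sample tracks are invented for illustration.

```python
import numpy as np

def head_velocity(head_y, dt=1 / 30):
    """Peak vertical head speed over a tracked sequence (assumes 30 fps)."""
    return np.max(np.abs(np.diff(head_y)) / dt)

def is_lying(bbox_w, bbox_h):
    """Bounding-box posture test: a wider-than-tall box suggests lying down."""
    return bbox_w > bbox_h

def detect_fall(head_y, bbox_w, bbox_h, v_thresh):
    """Two-stage rule from the abstract: a head-velocity peak above the
    (here fixed, in the paper dynamic) threshold, confirmed by posture."""
    return head_velocity(head_y) > v_thresh and is_lying(bbox_w, bbox_h)

# Head heights in metres: a fall drops ~1.3 m within a few frames;
# normal walking keeps the head height nearly constant.
fall_track = [1.6, 1.6, 1.5, 1.1, 0.6, 0.3, 0.25]
walk_track = [1.6, 1.61, 1.6, 1.59, 1.6, 1.6, 1.6]
```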
We present a spine arch analysis method for dairy cows using overhead 3D video data. The method is aimed at early-stage lameness detection, which is important because it allows early treatment, reducing animal suffering and minimizing the high forecasted financial losses caused by lameness. Our physical data collection setup is non-intrusive, covert, and designed to allow full automation; therefore, it could be implemented on a large scale or daily basis with high accuracy. We track the animal's spine using the shape index and curvedness measure of the 3D surface as she walks freely under the 3D camera. Our spinal analysis focuses on the thoracic vertebrae region, where we found most of the arching caused by lameness. A cubic polynomial is fitted to analyze the arch and estimate locomotion soundness. We found that the results become more accurate after eliminating the effect of regular neck/head movements from the arch. Using a 22-cow data set, we achieve an early-stage lameness detection accuracy of 95.4%.
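The arch-fitting step can be sketched with an ordinary cubic polynomial fit: measure how far the fitted spine curve rises above the chord joining its endpoints. The arch statistic and the sample curves below are our own simplifications, not the paper's exact soundness estimator.

```python
import numpy as np

def arch_height(spine_x, spine_z):
    """Fit a cubic polynomial to spine height samples and measure how far the
    fitted curve rises above the straight line joining its endpoints."""
    coeffs = np.polyfit(spine_x, spine_z, deg=3)
    xs = np.linspace(spine_x[0], spine_x[-1], 200)
    fitted = np.polyval(coeffs, xs)
    chord = np.interp(xs, [spine_x[0], spine_x[-1]], [fitted[0], fitted[-1]])
    return np.max(fitted - chord)

# Normalised back positions with heights in metres (invented samples):
x = np.linspace(0, 1, 12)
flat_back = 0.0 * x                       # sound cow: no arch
arched_back = 0.05 * np.sin(np.pi * x)    # lame cow: ~5 cm arch
```

A threshold on this arch statistic (after removing head/neck motion) would separate sound from lame locomotion.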
In animal behavior research, observing the behavior of an animal is usually done manually. The measurement of an animal's trajectory and a real-time description of its posture are often omitted due to the lack of automatic computer vision tools. Even though there are many publications on pose estimation, few methods are efficient enough to run in real time or can be used without training a classifier from mass samples with a machine learning algorithm. In this paper, we propose a novel strategy for real-time lobster posture estimation that overcomes these difficulties. In our algorithm, we use a Gaussian mixture model (GMM) for lobster segmentation. The posture estimation is then based on the distance transform and the skeleton calculated from the segmentation. We tested the algorithm on a series of lobster videos of different sizes and lighting conditions. The results show that our proposed algorithm is efficient and robust under various conditions.
In this paper, we propose a novel feature extraction method for face recognition based on the two-dimensional fractional Fourier transform (2D-FrFT). First, we extract the phase information of the facial image in 2D-FrFT, called the generalized phase spectrum (GPS). Then, we present an improved two-dimensional separability judgment (I2DSJ) to select appropriate order parameters for the discrete fractional Fourier transform. Finally, fusion of multiple orders' generalized phase spectrum bands (MGPSB) is proposed: to make full use of the discriminative information from different orders for face recognition, the approach merges the generalized phase spectra (GPS) of different orders of the 2D-FrFT. The proposed method does not need to construct a subspace through feature extraction methods and has a lower computation cost. Experimental results on public face databases demonstrate that our method outperforms representative methods.
We present in this paper a new approach to Arabic sign language (ArSL) alphabet recognition using hand gesture analysis. The analysis consists of extracting histogram of oriented gradients (HOG) features from a hand image and then using them to train SVM models, which are used to recognize the ArSL alphabet in real time from hand gestures captured with a Microsoft Kinect camera. Our approach involves three steps: (i) hand detection and localization using a Microsoft Kinect camera, (ii) hand segmentation, and (iii) feature extraction and Arabic alphabet recognition. On each input image, first obtained using the depth sensor, we apply a method based on hand anatomy to segment the hand and eliminate erroneous pixels. This approach is invariant to scale, rotation, and translation of the hand. Experimental results show the effectiveness of the new approach: the proposed system is able to recognize the ArSL alphabet with an accuracy of 90.12%.
A different approach to sign language recognition of static and dynamic hand movements was developed in this study using a normalized correlation algorithm. The goal of this research was to translate fingerspelling sign language into text using MATLAB and the Microsoft Kinect. Digital input images captured by the Kinect device are matched against template samples stored in a database. This Human Computer Interaction (HCI) prototype was developed to help people with communication disabilities express their thoughts with ease. Frame segmentation and feature extraction were used to give meaning to the captured images. Sequential and random testing was used to test both static and dynamic fingerspelling gestures. The researchers discuss some factors they encountered that caused misclassification of signs.
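Normalized correlation matching itself can be sketched as an exhaustive sliding-window search. This generic zero-mean NCC is the textbook form, not the study's MATLAB code, and the test image is random stand-in data.

```python
import numpy as np

def ncc(patch, template):
    """Zero-mean normalized correlation of two equal-sized arrays, in [-1, 1]."""
    p = patch - patch.mean()
    t = template - template.mean()
    denom = np.sqrt((p ** 2).sum() * (t ** 2).sum())
    return float((p * t).sum() / denom) if denom > 0 else 0.0

def best_match(image, template):
    """Slide the template over the image; return the top-left corner and
    score of the highest normalized-correlation window."""
    th, tw = template.shape
    best, pos = -2.0, (0, 0)
    for y in range(image.shape[0] - th + 1):
        for x in range(image.shape[1] - tw + 1):
            s = ncc(image[y:y + th, x:x + tw], template)
            if s > best:
                best, pos = s, (y, x)
    return pos, best

rng = np.random.default_rng(2)
image = rng.random((20, 20))
template = image[7:12, 3:8].copy()       # a known 5x5 sub-window
pos, score = best_match(image, template)
```

In the prototype, `template` would be one fingerspelling sample from the database and the argmax score over templates would decide the sign.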
This research explored the creation of a model to detect emotion in Filipino songs. The emotion model used was based on Paul Ekman's six basic emotions. The songs were classified into the following genres: kundiman, novelty, pop, and rock, and were annotated by a group of music experts based on the emotion each song induces in the listener. Musical features of the songs were extracted using jAudio, while the lyric features were extracted with a Bag-of-Words feature representation. The audio and lyric features of the Filipino songs were then used for classification by three chosen classifiers: Naïve Bayes, Support Vector Machines, and k-Nearest Neighbors. The goal of the research was to determine which classifier works best for Filipino music. Evaluation was done by 10-fold cross validation, and the accuracy, precision, recall, and F-measure results were compared. The models were also tested with unknown test data to further determine their accuracy through the prediction results.
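The Bag-of-Words lyric representation and a k-Nearest Neighbors vote can be sketched in plain Python. The tiny vocabulary, lyrics, and labels below are invented for illustration and are unrelated to the study's annotated corpus.

```python
from collections import Counter
import math

def bow_vector(text, vocab):
    """Bag-of-words term counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / den if den else 0.0

def knn_predict(query_vec, train_vecs, train_labels, k=3):
    """k-nearest-neighbour majority vote under cosine similarity."""
    order = sorted(range(len(train_vecs)),
                   key=lambda i: cosine(query_vec, train_vecs[i]), reverse=True)
    top = [train_labels[i] for i in order[:k]]
    return Counter(top).most_common(1)[0][0]

vocab = ["love", "heart", "cry", "party", "dance", "night"]
train = [("love heart love cry", "sad"),
         ("cry heart cry", "sad"),
         ("party dance night", "happy"),
         ("dance party dance", "happy")]
vecs = [bow_vector(t, vocab) for t, _ in train]
labels = [l for _, l in train]
pred = knn_predict(bow_vector("dance all night party", vocab), vecs, labels, k=3)
```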
Febus Reidj G. Cruz, Glenn O. Avendaño, Cyrel O. Manlises, James Jason G. Avellanosa, Jyacinth Camille F. Abina, Albert M. Masaquel, Michael Lance O. Siapno, Wen-Yaw Chung
Proceedings Volume Eighth International Conference on Graphic and Image Processing (ICGIP 2016), 102250K (2017) https://doi.org/10.1117/12.2266932
Disasters such as typhoons, tornadoes, and earthquakes are inevitable, and their aftermath includes missing people. Using robots with human detection capabilities to locate missing people can dramatically reduce the harm and risk to those who work in such circumstances. This study aims to design and build a tele-operated robot; implement in MATLAB an algorithm for the detection of humans; and create a database for human identification based on various positions, angles, light intensities, and distances from which humans will be identified. Different light intensities were produced using Photoshop to simulate smoke, dust, and water-drop conditions. After processing the image, the system indicates whether a human is detected or not. Testing with covered bodies was also conducted to test the algorithm's robustness. Based on the results, the algorithm can detect humans with the full body shown. For upright and lying positions, detection works from 8 feet to 20 feet. For the sitting position, detection works from 2 feet to 20 feet, with slight variances in results because of different lighting conditions. At distances greater than 20 feet, no humans can be detected, or false negatives occur. For covered bodies, the algorithm can detect humans in the cases made under the given circumstances. In all three positions, humans can be detected from 0 degrees to 180 degrees under normal, smoke, dust, and water-droplet conditions. This study was able to design and build a tele-operated robot with a MATLAB algorithm that detects humans with an overall precision of 88.30%, from which a database was created for human identification under various conditions.
As possible active sites, the concave and convex feature regions of a molecule are the locations where molecular docking is most likely to happen, so searching for those regions is a valuable problem to study. In this paper, a new method is proposed for identifying concave and convex regions. Based on an established spherical mapping between the molecular surface and its bounding-sphere surface, the concave and convex vertices of local areas can be determined according to the expansion distance defined by the spherical mapping. Then, through mesh growing, a feature region can be formed from a concave or convex point, also called the center point, and its neighboring faces whose normal vectors lie within a specified angular range of the center point's. After that, the areas and volumes of the feature regions are calculated. The experimental results indicate that the method identifies the concave and convex characteristics of the molecule well.
Jessie R. Balbin, Dionis A. Padilla, Janette C. Fausto, Ernesto M. Vergara Jr., Ramon G. Garcia, Bethsedea Joy S. Delos Angeles, Neil John A. Dizon, Mark Kevin N. Mardo
Proceedings Volume Eighth International Conference on Graphic and Image Processing (ICGIP 2016), 102250M (2017) https://doi.org/10.1117/12.2266986
This research is about translating a series of hand gestures to form a word and producing its equivalent sound as it is read and said with a Filipino accent, using Support Vector Machines and Mel Frequency Cepstral Coefficient analysis. The concept is to detect Filipino speech input and translate the spoken words to their text form in Filipino. This study aims to help the Filipino deaf community impart their thoughts through hand gestures and communicate with people who do not know how to read hand gestures. It also helps the literate deaf simply read the spoken words relayed to them by the Filipino speech-to-text system.
This paper presents a technique based on the Circle Hough Transform (CHT) for optical Braille recognition (OBR). Unlike other work on the same topic, it uses the Hough transform for the recognition and transcription of Braille cells, showing that CHT copes well with several non-systematic factors that can affect the process, such as the type of paper on which the text is printed, lighting conditions, input image resolution, and flaws introduced during capture, which is performed with a scanner. Tests are performed on a local database containing text produced by sighted people and transcripts produced by blind people, with the support of the National Institute for Blind People (INCI, from its Spanish acronym) in Colombia.
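A minimal, pure-NumPy sketch of the Circle Hough Transform for a known dot radius (the full OBR pipeline, with radius search and Braille-cell decoding, is beyond this toy): each edge pixel votes for every candidate centre lying one radius away, and the accumulator peak recovers the dot centre.

```python
import numpy as np

def hough_circles_fixed_r(edge_points, radius, shape):
    """Minimal Circle Hough Transform for a known radius: each edge point
    votes for all centre positions lying `radius` away from it."""
    acc = np.zeros(shape, dtype=int)
    angles = np.linspace(0, 2 * np.pi, 360, endpoint=False)
    for (y, x) in edge_points:
        cy = np.round(y - radius * np.sin(angles)).astype(int)
        cx = np.round(x - radius * np.cos(angles)).astype(int)
        ok = (cy >= 0) & (cy < shape[0]) & (cx >= 0) & (cx < shape[1])
        np.add.at(acc, (cy[ok], cx[ok]), 1)
    return acc

# synthetic "Braille dot": edge pixels of a circle of radius 5 centred at (20, 30)
t = np.linspace(0, 2 * np.pi, 60, endpoint=False)
edges = np.stack([np.round(20 + 5 * np.sin(t)),
                  np.round(30 + 5 * np.cos(t))], axis=1)
acc = hough_circles_fixed_r(edges, radius=5, shape=(40, 60))
# the accumulator peak lands at (or immediately next to) the true centre
```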
The 1-D mapping is an intensity-based method used to estimate a projective transformation between two images. However, it lacks an intensity-invariant criterion for deciding whether two images can be aligned. This paper proposes a novel decision criterion and thus develops an error-detective 1-D mapping method. First, a multiple 1-D mapping scheme is devised to yield redundant estimates of an image transformation. Then, a voting scheme is proposed to verify these multiple estimates: an estimate that fails to receive all the votes is taken as grounds for false-match rejection. Based on this decision criterion, an error-detective 1-D mapping algorithm is constructed. Finally, the proposed algorithm is evaluated by registering real image pairs over a large range of projective transformations.
The scale-invariant feature transform (SIFT) algorithm detects keypoints via difference of Gaussian (DoG) images. However, the DoG data lacks high-frequency information, which can degrade the algorithm's performance. To address this issue, this paper proposes a novel log-polar feature detector (LPFD) to detect scale-invariant blobs (keypoints) in log-polar space, which, in contrast, retains all the image information. The algorithm consists of three components, viz. keypoint detection, descriptor extraction and descriptor matching. The algorithm is evaluated on keypoint detection over the INRIA dataset against the SIFT algorithm and one of its fast variants, the speeded-up robust features (SURF) algorithm, in terms of four performance measures, viz. correspondences, repeatability, correct matches and matching score.
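A nearest-neighbour log-polar resampler sketches the space such a detector works in: each log-polar row samples a single radius, rotations of the input become circular shifts along the angular axis, and scalings become shifts along the radial axis. The sampling parameters below are illustrative, not those of the LPFD.

```python
import numpy as np

def to_log_polar(img, n_r=32, n_theta=64):
    """Nearest-neighbour resampling of a square image into log-polar
    coordinates around its centre."""
    h, w = img.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    r_max = min(cy, cx)
    rs = np.exp(np.linspace(0.0, np.log(r_max), n_r))   # log-spaced radii
    ts = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    R, T = np.meshgrid(rs, ts, indexing="ij")
    ys = np.clip(np.round(cy + R * np.sin(T)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + R * np.cos(T)).astype(int), 0, w - 1)
    return img[ys, xs]

# a radially symmetric image maps to near-constant rows in log-polar space
yy, xx = np.mgrid[0:65, 0:65]
radial = np.hypot(yy - 32, xx - 32)      # value depends only on the radius
lp = to_log_polar(radial)
```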
Activity-dependent changes in the synaptic connections of the brain are tightly related to learning and memory. Previous studies have shown that essentially all new synaptic contacts are made by adding new partners to existing synaptic elements. To further explore synaptic dynamics in specific pathways, concurrent imaging of pre- and postsynaptic structures in identified connections is required; consequently, considerable attention has been paid to the automated detection of axonal boutons. Unlike most previous methods, which were proposed for in vitro data, this paper considers the more practical case of in vivo neuron images, which provide real-time information and direct observation of the dynamics of a disease process in the mouse. We present an automated approach for detecting axonal boutons that first deconvolves the original images, then thresholds the enhanced images, and finally retains the regions fulfilling a series of criteria. Experimental results on in vivo two-photon imaging of mouse demonstrate the effectiveness of the proposed method.
Underwater images are blurred due to light scattering and absorption. Image restoration is therefore important in many underwater research and practical tasks. In this paper, we propose an effective two-stage method to restore underwater scene images. Based on an underwater light propagation model, we first remove backscatter by fitting a binary quadratic function. Then we eliminate the forward scattering and non-uniform lighting attenuation using blue-green dark channel prior. The proposed method requires no additional calibration and we show its effectiveness and robustness by restoring images captured under various underwater scenes.
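The backscatter-removal step can be sketched as an ordinary least-squares fit of a binary quadratic surface z = a·y² + b·x² + c·xy + d·y + e·x + f over the image grid; the fitted surface (an estimate of the smooth backscatter field) would then be subtracted from the image. The grid and coefficients below are synthetic.

```python
import numpy as np

def fit_binary_quadratic(y, x, z):
    """Least-squares fit of z = a*y^2 + b*x^2 + c*x*y + d*y + e*x + f."""
    A = np.stack([y * y, x * x, x * y, y, x, np.ones_like(x)], axis=1)
    coef, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coef

# synthetic smooth backscatter field with known coefficients
yy, xx = np.mgrid[0:20, 0:30].astype(float)
y, x = yy.ravel(), xx.ravel()
coef_true = np.array([0.01, -0.02, 0.005, 0.3, -0.1, 5.0])
z = np.stack([y * y, x * x, x * y, y, x, np.ones_like(x)], axis=1) @ coef_true
coef_est = fit_binary_quadratic(y, x, z)   # recovers coef_true
```

Subtracting the surface evaluated on the grid (`z - A @ coef_est`) removes the modelled backscatter while leaving scene detail in the residual.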
Recent medical virtual reality (VR) applications for minimizing re-operations are being studied to improve surgical efficiency and reduce operation error. A CT image acquisition method that takes three-dimensional (3D) modeling into account is important for medical VR applications, because a realistic model of the actual human organ is required. However, research on medical VR applications has focused on 3D modeling techniques and the use of 3D models; CT image acquisition methods oriented toward 3D modeling have not been reported. Conventional CT acquisition scans a limited area around the lesion once or twice for the doctor's diagnosis, whereas a medical VR application requires CT images covering patients' various postures and a wider area than the lesion. The wider area is needed because dyskinesia diagnosis of the shoulder, pelvis, and leg involves comparing the bilateral sides, and the various postures are needed because they affect the musculoskeletal system differently. In this paper, we therefore perform a comparative experiment on CT images acquired with different image areas (unilateral/bilateral) and patient postures (neutral/abducted). CT images were acquired from 10 patients, and the acquired images are evaluated based on the length per pixel and the morphological deviation. Finally, by comparing the experimental results, we evaluate the CT image acquisition method for medical VR applications.
The use of fisheye image stitching is a popular way to produce panoramic images rapidly. A fisheye lens with a large field of view reduces the number of image acquisitions required, but its distortion makes image registration difficult. As a key determinant of fisheye mosaic quality, the estimation of the external parameters of the images has become a core issue in panorama generation. Previously established methods require additional measurement devices or are only suitable for specific scenarios. In this study, we develop an image mosaic method in which a large-angle fisheye lens is used with conformal projection and spherical matching. First, to facilitate the search for corresponding corners, the source images are rotated by 90°, and conformal projection is used to present the feature structures of the overlap area of the two images in a consistent manner. Second, the matched corner points are projected onto the unit sphere, and the motion parameters are estimated with the direct linear transform method. Experimental results show that our method can efficiently register fisheye images with a large angle between them; hence, a full-view 360°×180° panoramic image can be obtained from two fisheye cameras with fields of view of more than 180°.
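The "project corner points onto the unit sphere" step can be sketched as follows, assuming an equidistant fisheye model r = f·θ (a common illustrative choice; the paper's conformal projection uses a different radial mapping):

```python
import numpy as np

def fisheye_to_sphere(u, v, cx, cy, f):
    """Back-project fisheye pixel(s) (u, v) onto the unit sphere under an
    equidistant projection model r = f * theta (illustrative assumption)."""
    du, dv = u - cx, v - cy
    theta = np.hypot(du, dv) / f        # angle from the optical axis
    phi = np.arctan2(dv, du)            # azimuth around the axis
    return np.stack([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)], axis=-1)

axis_ray = fisheye_to_sphere(320.0, 240.0, 320.0, 240.0, 300.0)   # principal point
corner_ray = fisheye_to_sphere(400.0, 240.0, 320.0, 240.0, 300.0)
```

With matched corners expressed as such unit rays on the two spheres, the relative motion between the cameras can be estimated linearly, which is the direct-linear-transform step the abstract refers to.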
In this paper, we propose using higher order local autocorrelation (HLAC) features to classify liver cirrhosis on region-of-interest (ROI) images taken from B-mode ultrasound images. In a previous study, we attempted to classify liver cirrhosis with a Gabor-filter-based approach, but our preliminary experiments showed that the classification performance of the Gabor feature was poor. To classify liver cirrhosis accurately, we therefore examined the HLAC features instead. The experimental results show the effectiveness of HLAC features compared with the Gabor feature. Furthermore, the classification performance of the HLAC features improved when using a binary image produced by an adaptive thresholding method.
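HLAC features can be sketched directly from their definition: for each displacement mask, sum over the image the product of the pixel values at the mask's offsets. A subset of the classic 25 masks (orders 0 to 2 within a 3×3 window) is shown; the full mask set used in the paper is not reproduced here.

```python
import numpy as np

def hlac_features(img, masks):
    """Higher-order local autocorrelation: for each mask (a list of 2D
    displacements including (0, 0)), sum over the image of the product of
    the pixel values at those displacements."""
    h, w = img.shape
    feats = []
    for mask in masks:
        dys = [d[0] for d in mask]
        dxs = [d[1] for d in mask]
        y0, y1 = -min(dys), h - max(dys)
        x0, x1 = -min(dxs), w - max(dxs)
        prod = np.ones((y1 - y0, x1 - x0))
        for dy, dx in mask:
            prod = prod * img[y0 + dy:y1 + dy, x0 + dx:x1 + dx]
        feats.append(prod.sum())
    return np.array(feats)

# one order-0, two order-1, and one order-2 mask (subset of the classic 25)
masks = [[(0, 0)],
         [(0, 0), (0, 1)],
         [(0, 0), (1, 0)],
         [(0, 0), (0, 1), (1, 0)]]
feats = hlac_features(np.ones((4, 4)), masks)
# on an all-ones 4x4 image the sums are just the valid-window sizes
```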
Gray Level Co-occurrence Matrix (GLCM) is one of the main techniques for texture analysis and has been widely used in many applications. Conventional GLCMs usually focus on two-dimensional (2D) image texture analysis only, whereas a three-dimensional (3D) image volume requires its own texture computation. In this paper, an extended 2D-to-3D GLCM approach is proposed, based on the concept of multiple 2D plane positions and pixel orientation directions in the 3D environment. The algorithm breaks the 3D image volume down into 2D slices based on five plane positions (the coordinate axes and oblique axes), yielding 13 independent directions, and then calculates the GLCMs. The resulting GLCMs are averaged to obtain normalized values, from which the 3D texture features are calculated. A preliminary examination was performed on a 3D image volume (64 x 64 x 64 voxels). Our analysis confirmed that the proposed technique is capable of extracting 3D texture features from the extended GLCMs. It is a simple and comprehensive technique that can contribute to 3D image analysis.
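The core of the extension is counting grey-level pairs along a 3D displacement vector; the 13 independent directions are then handled by calling the same routine with different offsets and averaging the results. A minimal sketch:

```python
import numpy as np

def glcm_3d(vol, offset, levels):
    """GLCM of an integer-valued 3D volume for one displacement (dz, dy, dx):
    counts how often grey level i co-occurs with grey level j at that offset."""
    dz, dy, dx = offset
    z0, y0, x0 = max(0, -dz), max(0, -dy), max(0, -dx)
    z1 = vol.shape[0] - max(0, dz)
    y1 = vol.shape[1] - max(0, dy)
    x1 = vol.shape[2] - max(0, dx)
    a = vol[z0:z1, y0:y1, x0:x1]                                  # reference voxels
    b = vol[z0 + dz:z1 + dz, y0 + dy:y1 + dy, x0 + dx:x1 + dx]    # neighbours
    glcm = np.zeros((levels, levels), dtype=int)
    np.add.at(glcm, (a.ravel(), b.ravel()), 1)
    return glcm

# toy volume: bottom z-slice all level 0, top z-slice all level 1
vol = np.zeros((2, 2, 2), dtype=int)
vol[1] = 1
g = glcm_3d(vol, (1, 0, 0), levels=2)   # co-occurrences along +z
```

Averaging `glcm_3d` outputs over the 13 offsets and normalising by the total count gives the matrix from which Haralick-style texture features are computed.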
In this paper, we propose two new problems related to the classification of photographed document images and, based on deep learning methods, present baseline solutions for both. The first problem is: for a set of photographed document images, which book do they belong to? The second is: what is the type of the book they belong to? To address these problems, we apply "AlexNet" to the collected document images. Using "AlexNet" pre-trained on the ImageNet data set directly, we obtain 92.57% accuracy for book-name classification and 93.33% for book-type classification. After fine-tuning on the training set of photographed document images, the accuracy increases to 95.54% for book-name classification and 95.42% for book-type classification. To the best of our knowledge, although many image classification algorithms exist, no previous work has targeted these two challenging problems. In addition, the experiments demonstrate that deep-learning features outperform features extracted with traditional image descriptors on these problems.
This paper presents a new compilation of image-transformation data based on the series of the trace transform. The advantage of this new compilation is that it mitigates the noise problem that commonly appears in pattern recognition. Sub-micro pattern analysis is employed to encode the series of the trace transform from an image, organized by the shift-invariant sub-micro pattern scheme (SiSMP). All data are then summed into a 2-D histogram and processed with the discriminant feature transform. Our experiments on the Brodatz texture database show that the new compilation achieves better recognition performance than the LBP technique.
Pyramid Histogram of Words (PHOW) combines Bag of Visual Words (BoVW) with spatial pyramid matching (SPM) in order to add location information to the extracted features. However, PHOW variants are extracted from various color spaces without extracting color information individually; they thus discard color information, an important characteristic of any image that is motivated by human vision. This article concatenates the PHOW Multi-Scale Dense Scale-Invariant Feature Transform (MSDSIFT) histogram with a proposed color histogram to improve the performance of existing image classification algorithms. Performance evaluation on several datasets shows that the new approach outperforms existing state-of-the-art methods.
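The proposed colour descriptor can be sketched as a per-channel, L1-normalised intensity histogram that is concatenated with the PHOW/MSDSIFT vector; the bin count below is an assumption for illustration, not a value taken from the paper.

```python
import numpy as np

def color_histogram(rgb, bins=8):
    """Per-channel intensity histogram, concatenated and L1-normalised,
    as a simple colour descriptor to append to shape features like PHOW."""
    hists = [np.histogram(rgb[..., c], bins=bins, range=(0, 256))[0]
             for c in range(3)]
    h = np.concatenate(hists).astype(float)
    return h / h.sum()

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3))
desc = color_histogram(img)     # 3 channels x 8 bins = 24-D, sums to 1
```

The combined feature is then simply `np.concatenate([phow_vector, desc])` before classification.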
Multi-label image annotation (MIA) has been widely studied in recent years and many MIA schemes have been proposed, but most existing schemes remain unsatisfactory. In this paper, an improved multiple kernel learning (IMKL) method for the support vector machine (SVM) is proposed to improve its classification accuracy, and a novel MIA scheme based on IMKL is then presented, which uses a discriminant loss to control the number of top semantic labels and applies feature selection to further improve performance. The experimental results show that the proposed MIA scheme achieves higher performance than existing MIA schemes, and its performance is satisfactory on large image datasets.
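The building block of any multiple-kernel-learning method is a weighted combination of base Gram matrices; the weights below are fixed by hand for illustration, whereas an MKL method such as the paper's IMKL learns them. A sketch with two RBF kernels:

```python
import numpy as np

def rbf_gram(X, Y, gamma):
    """RBF Gram matrix: K_ij = exp(-gamma * ||x_i - y_j||^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * d2)

def combined_kernel(X, Y, weights, gammas):
    """Fixed convex combination of RBF kernels at several bandwidths;
    learning these weights is what distinguishes an MKL method."""
    return sum(w * rbf_gram(X, Y, g) for w, g in zip(weights, gammas))

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
K = combined_kernel(X, X, weights=[0.7, 0.3], gammas=[0.5, 2.0])
```

Because the weights are non-negative, `K` remains a valid (positive semi-definite) kernel and can be fed to an SVM that accepts precomputed kernels, e.g. scikit-learn's `SVC(kernel="precomputed")`.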
Accurate information extraction from images can only be realised if the data is blur free and contains no artificial artefacts. In astronomical images, apart from hardware limitations, biases are introduced by phenomena beyond control such as atmospheric turbulence. The induced blur function varies in both time and space depending on the astronomical “seeing” conditions as well as the wavelengths being recorded. Multi-frame blind image deconvolution attempts to recover a sharp latent image from a sequence of blurry and noisy observations without knowledge of the blur applied to each image in the recorded sequence. Finding a solution to this inverse problem, estimating the original scene from convolved data, is a heavily ill-posed problem. In this paper we describe a novel multi-frame blind deconvolution algorithm that performs image restoration by recovering the frequency and phase information of the latent sharp image in two separate steps. For every image in the sequence, a point-spread function (PSF) is estimated that allows iterative refinement of our latent sharp image estimate. The datasets generated for testing purposes assume Moffat or complex Kolmogorov blur kernels. The results from our implemented prototype are promising and encourage further research.
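The iterative refinement step given a PSF estimate can be sketched with Richardson-Lucy iterations (a standard choice for this step, not necessarily the paper's exact update). The PSF is assumed symmetric here, so the adjoint blur equals the forward blur; convolution is circular via the FFT.

```python
import numpy as np

def blur(img, psf):
    """Circular convolution via FFT; the psf is centred in its array."""
    return np.fft.irfft2(np.fft.rfft2(img) * np.fft.rfft2(np.fft.ifftshift(psf)),
                         s=img.shape)

def richardson_lucy(observed, psf, n_iter=25):
    """Richardson-Lucy refinement given a PSF estimate (PSF assumed
    symmetric). Each step multiplies the estimate by the back-projected
    ratio of observed to re-blurred data; total flux is preserved."""
    psf = psf / psf.sum()
    est = np.full(observed.shape, observed.mean())
    for _ in range(n_iter):
        ratio = observed / np.maximum(blur(est, psf), 1e-12)
        est *= blur(ratio, psf)
    return est

# blur a point source with a Gaussian PSF, then sharpen it back
yy, xx = np.mgrid[0:32, 0:32] - 16
psf = np.exp(-(yy ** 2 + xx ** 2) / (2 * 2.0 ** 2))
sharp = np.zeros((32, 32))
sharp[16, 16] = 1.0
observed = blur(sharp, psf / psf.sum())
est = richardson_lucy(observed, psf)    # peak grows back toward the point source
```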
In this paper we propose a technique for obtaining a high resolution (HR) image from a single low resolution (LR) image, using a jointly learned dictionary, on the basis of image statistics research suggesting that, with an appropriate choice of an over-complete dictionary, image patches can be well represented as a sparse linear combination. Medical imaging for clinical analysis and medical intervention creates visual representations of the interior of a body, as well as of the function of some organs or tissues (physiology). A number of medical imaging techniques are in use, such as MRI, CT, X-ray, and Optical Coherence Tomography (OCT). OCT is one of the newer technologies in medical imaging; one of its uses is in ophthalmology, for analysis of choroidal thickness in healthy eyes and in disease states such as age-related macular degeneration, central serous chorioretinopathy, diabetic retinopathy, and inherited retinal dystrophies. We propose a technique for enhancing OCT images so that particular diseases can be clearly identified and analyzed. Our method uses dictionary learning to generate a high resolution image from a single input LR image. We train two joint dictionaries, one on OCT images and the other on multiple different natural images, and compare the results with a previous SR technique. With both dictionaries, the proposed method produces HR images that are superior in quality to those of the previous SR technique, and it is also effective for noisy OCT images, producing up-sampled and enhanced output.
The tools developed by the school of geostatistics have many applications in image segmentation. First, they are well suited to the analysis of natural images, e.g. remote sensing and medical images. Second, they are computationally cheaper than methods based on Fourier analysis or co-occurrence matrices. We review the work of various authors on segmenting natural textures.
The finite mixture model based on the Gaussian distribution is a flexible and powerful tool for image segmentation. However, in ultrasound images the intensity distributions are non-symmetric, whereas the Gaussian distribution is symmetric. In this study, a new finite bounded Rayleigh mixture model is proposed. One advantage of the proposed model is that the Rayleigh distribution is non-symmetric and can fit the shape of medical ultrasound data; another is that each component of the model is well suited to ultrasound image segmentation. We apply the bounded Rayleigh mixture model in order to improve accuracy and reduce computation time. Experiments show that the proposed model outperforms state-of-the-art methods in both accuracy and time consumption.
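The component model can be sketched with EM for a plain (unbounded) Rayleigh mixture; the paper's bounded variant would additionally renormalise each component over the bounded support. Responsibilities come from the E-step, and the σ update maximises the expected Rayleigh log-likelihood.

```python
import numpy as np

def rayleigh_pdf(x, sigma):
    return (x / sigma ** 2) * np.exp(-x ** 2 / (2 * sigma ** 2))

def rayleigh_mixture_em(x, sigmas, weights, n_iter=200):
    """EM for a finite Rayleigh mixture (unbounded variant)."""
    sigmas = np.asarray(sigmas, float)
    weights = np.asarray(weights, float)
    for _ in range(n_iter):
        # E-step: responsibilities gamma_ik
        resp = weights[:, None] * rayleigh_pdf(x[None, :], sigmas[:, None])
        resp /= resp.sum(axis=0, keepdims=True)
        # M-step: sigma_k^2 = sum_i gamma_ik x_i^2 / (2 sum_i gamma_ik)
        weights = resp.mean(axis=1)
        sigmas = np.sqrt((resp * x ** 2).sum(axis=1) / (2 * resp.sum(axis=1)))
    return sigmas, weights

rng = np.random.default_rng(1)
x = np.concatenate([rng.rayleigh(1.0, 3000), rng.rayleigh(4.0, 3000)])
sig, w = rayleigh_mixture_em(x, sigmas=[0.5, 5.0], weights=[0.5, 0.5])
# sig approaches (1, 4) and w approaches (0.5, 0.5)
```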
Mammography is currently the most effective imaging modality used by radiologists for the screening of breast cancer. Segmentation is one of the key steps in developing anatomical models for the calculation of a safe medical radiation dose. This paper explores the potential of the statistical region merging (SRM) segmentation technique for breast segmentation in digital mammograms. First, the mammograms are pre-processed for region enhancement; the enhanced images are then segmented using SRM at multiple scales; finally, these segmentations are combined to separate the breast region from the background, detect region edges, and separate regions of interest (ROIs). The experiments are performed on a data set of mammograms from different patients, demonstrating the validity of the proposed criterion. The results show that the SRM algorithm works well on the segmentation of medical images and is more accurate than other methods, and that the technique has great potential to become a method of choice for the segmentation of mammograms.
An omega-3 chicken egg is a chicken egg produced through food engineering technology: it comes from hens fed a diet high in omega-3 fatty acids, and its omega-3 content is fifteen times that of a Leghorn egg. Visually, its shell has the same shape and colour as a Leghorn's. The eggs can be distinguished by breaking the shell and testing the yolk's nutrient content in a laboratory, but that method is neither effective nor efficient. The purpose of this research is therefore an application that detects omega-3 chicken eggs using mobile-based computer vision. The application was built with the OpenCV computer vision library on the Android operating system. Chicken egg images were taken using an egg candling box, with 60 omega-3 and Leghorn eggs as samples. Image acquisition was performed with an Android smartphone, after which several image processing steps were applied: GrabCut, conversion of the RGB image to 8-bit grayscale, median filtering, P-tile segmentation, and morphological operations. Feature extraction then computed the mean, variance, skewness, and kurtosis of each image, and the egg images were classified from these measurements. The results showed that omega-3 and Leghorn eggs have different feature values, and the system provides an accuracy of around 91%.
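Two steps of the pipeline, P-tile segmentation and the four moment features, can be sketched as follows; the object-area fraction `p` and the synthetic image are placeholders, not values from the paper.

```python
import numpy as np

def p_tile_threshold(gray, p):
    """P-tile segmentation: pick the threshold so that a fraction p of
    the pixels (the assumed object area) lies above it."""
    return np.quantile(gray, 1.0 - p)

def moment_features(values):
    """Mean, variance, skewness and kurtosis of an intensity sample --
    the four texture features used for the egg classification."""
    m = values.mean()
    v = values.var()
    s = ((values - m) ** 3).mean() / v ** 1.5
    k = ((values - m) ** 4).mean() / v ** 2
    return np.array([m, v, s, k])

rng = np.random.default_rng(0)
gray = rng.normal(128.0, 20.0, size=(64, 64))   # stand-in for a candled egg image
t = p_tile_threshold(gray, p=0.25)              # keep the brightest 25 %
feats = moment_features(gray[gray > t])         # feature vector for a classifier
```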
Postsurgical wound care has a great impact on patients' prognosis. It often takes a few days, or even a few weeks, for the wound to stabilize, which incurs a great cost in health care and nursing resources. To assess the wound condition for diagnosis, it is important to segment out the wound region for further analysis; however, such images often contain a complicated background and noise. In this study, we propose a wound segmentation algorithm based on the Canny edge detector and a genetic algorithm with an unsupervised evaluation function. Evaluated on 112 clinical images, 94.3% of the images were correctly segmented, as judged by experienced medical doctors. This capability to extract complete wound regions makes further image analysis possible, such as intelligent recovery evaluation and automatic infection assessment.
In this paper, a nonlinear least squares twin support vector machine (NLSTSVM) integrated with an active contour model (ACM) is proposed for noisy image segmentation. Kernel-generated surfaces, rather than hyper-planes, are sought for the pixels belonging to the foreground and background, respectively, using the kernel trick to enhance performance. Concurrent self-organizing maps (SOMs) approximate the intensity distributions in a supervised way, establishing the original training sets for the NLSTSVM; the two sets are then updated at each iteration by adding the global region-average intensities. Moreover, a local variable regional term, rather than an edge stop function, is adopted in the energy function to improve noise robustness. Experimental results demonstrate that our model achieves higher segmentation accuracy and greater noise robustness.
Automatic segmentation of the Left Ventricle (LV) is an essential task in the field of computer-aided analysis of cardiac function. In this paper, a simplified pulse coupled neural network (SPCNN) based approach is proposed to segment the LV endocardium automatically. Different from traditional image-driven methods, the SPCNN based approach is independent of the image gray distribution models, which makes it more stable. Firstly, the temporal and spatial characteristics of the cardiac magnetic resonance image are used to extract a region of interest and to locate the LV cavity. Then, the SPCNN model is iteratively applied with an increasing parameter to segment an optimal cavity. Finally, the endocardium is delineated via several post-processing operations. Quantitative evaluation is performed on the public database provided by MICCAI 2009. Over all studies, all slices, and two phases (end-diastole and end-systole), the average percentage of good contours is 91.02%, the average perpendicular distance is 2.24 mm, and the overlapping dice metric is 0.86. These results indicate that the proposed approach possesses high precision and good competitiveness.
Since handwritten text lines are generally skewed and not clearly separated, text line segmentation of handwritten document images is still a challenging problem. In this paper, we propose a novel text line segmentation algorithm based on spectral clustering. Given a handwritten document image, we first convert it to a binary image and then compute the adjacency matrix of the foreground pixels. We apply spectral clustering to this similarity matrix and use the orthogonal k-means clustering algorithm to group the text lines. Experiments on a Chinese handwritten document database (HIT-MW) demonstrate the effectiveness of the proposed method.
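A toy version of the pipeline for the two-line case: build a Gaussian similarity graph over foreground pixel coordinates, form the normalised Laplacian, and split on the sign of the second-smallest eigenvector (the Fiedler vector). The paper additionally uses orthogonal k-means to handle more than two lines; the point coordinates below are synthetic.

```python
import numpy as np

def fiedler_split(points, sigma):
    """Two-way spectral clustering: Gaussian similarity graph, normalised
    Laplacian, and a sign split on the second-smallest eigenvector."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2 * sigma ** 2))                 # similarity matrix
    d = W.sum(axis=1)
    L = np.eye(len(points)) - W / np.sqrt(d[:, None] * d[None, :])
    _, vecs = np.linalg.eigh(L)                        # ascending eigenvalues
    return vecs[:, 1] > 0                              # Fiedler-vector sign

# two slightly wavy horizontal "text lines", 20 pixels apart
xs = np.arange(30.0)
wave = 0.5 * np.sin(xs / 5.0)
line1 = np.stack([xs, wave], axis=1)
line2 = np.stack([xs, wave + 20.0], axis=1)
labels = fiedler_split(np.vstack([line1, line2]), sigma=3.0)
# each line ends up entirely on one side of the split
```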
This paper proposes a layered approach to improve the embedding capacity of existing pixel-value differencing (PVD) methods for image steganography. Specifically, one of the PVD methods is applied to embed secret information into a cover image, and the resulting image, called a stego-image, is used to embed additional secret information by the same or another PVD method. This results in a double-layered stego-image. Another PVD method can then be applied to the double-layered stego-image, yielding a triple-layered stego-image; in the same way, multi-layered stego-images can be obtained. To successfully recover the secret information hidden in each layer, the embedding process is carefully designed. In the experiments, the proposed layered PVD method proved to be effective.
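To make the base operation concrete, here is a minimal single-layer PVD embed/extract sketch in the style of Wu and Tsai's original scheme; the range table and overflow handling are simplified, and this is not the paper's own layered implementation:

```python
# Quantization ranges for the pixel-pair difference |d|; wider ranges
# (busier texture) carry more bits.
RANGES = [(0, 7), (8, 15), (16, 31), (32, 63), (64, 127), (128, 255)]

def _range_of(d):
    for lo, hi in RANGES:
        if lo <= abs(d) <= hi:
            return lo, hi
    raise ValueError(d)

def pvd_embed(p1, p2, bits):
    """Embed a bit string into one pixel pair; returns the new pair and
    the number of bits consumed (pads with zeros if bits is short)."""
    d = p2 - p1
    lo, hi = _range_of(d)
    t = (hi - lo + 1).bit_length() - 1      # capacity in bits for this range
    b = int(bits[:t].ljust(t, "0"), 2)
    d_new = (lo + b) if d >= 0 else -(lo + b)
    m = d_new - d
    # Split the adjustment between the two pixels (ceil/floor split).
    return p1 - (m + 1) // 2, p2 + m // 2, t

def pvd_extract(p1, p2):
    """Recover the embedded bits from a stego pixel pair."""
    d = p2 - p1
    lo, hi = _range_of(d)
    t = (hi - lo + 1).bit_length() - 1
    return format(abs(d) - lo, "0{}b".format(t))


new1, new2, used = pvd_embed(100, 110, "101")
print(new1, new2, used)          # adjusted stego pair and bits used
print(pvd_extract(new1, new2))   # the hidden bits come back out
```

Layering, as the abstract describes it, amounts to running such an embed step again on the stego pixels, with the extraction order reversed.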
In this work, a fingerprint spoof detection method using an extended feature, namely the Wavelet-based Local Binary Pattern (Wavelet-LBP), is introduced. Conventional wavelet-based methods calculate the wavelet energy of sub-band images as the discriminative feature, whereas we propose to use the Local Binary Pattern (LBP) operator to capture the local appearance of the sub-band images instead. The fingerprint image is first decomposed by the two-dimensional discrete wavelet transform (2D-DWT), and LBP is then applied to the derived wavelet sub-band images. The extracted features are used to train a Support Vector Machine (SVM) classifier that classifies fingerprint images as genuine or spoofed. Experiments on the Fingerprint Liveness Detection Competition (LivDet) datasets show that the proposed feature improves fingerprint spoof detection.
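For reference, the basic 3x3 LBP operator mentioned above can be sketched as follows (an illustrative pure-Python version, not the paper's implementation, which applies the operator to wavelet sub-bands):

```python
def lbp_code(img, r, c):
    """Basic 3x3 local binary pattern code for pixel (r, c) of a 2-D list:
    each neighbor contributes a 1-bit if it is >= the center value."""
    center = img[r][c]
    # Neighbors in clockwise order starting from the top-left.
    offs = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dr, dc in offs:
        code = (code << 1) | (1 if img[r + dr][c + dc] >= center else 0)
    return code


img = [[9, 9, 9],
       [0, 5, 0],
       [0, 0, 0]]
print(lbp_code(img, 1, 1))   # only the three top neighbors exceed the center
```

A histogram of such codes over each sub-band image then forms the feature vector fed to the SVM.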
Lens flare effects, available in various photo and camera applications today, can partially or fully destroy the watermark information within a watermarked image. In this paper we propose a spatial-domain image watermarking method robust against lens flare effects. The watermark is embedded by modifying the saturation component in the HSV color space of the host image. For watermark extraction, a homomorphic filter is used to predict the original component from the watermarked one, and the watermark is blindly recovered by differencing the two. The watermarked image's quality is evaluated by wPSNR, while the extracted watermark's accuracy is evaluated by normalized correlation (NC). Experimental results against various lens flare effects from both desktop software and mobile applications show that the proposed method outperforms previous methods.
Indoor 3D visual reconstruction has many applications and, once done accurately, it enables people to conduct various indoor activities efficiently. For example, an effective and efficient emergency rescue response can be mounted in a fire disaster by using 3D visual information of a damaged building. We have therefore developed an accurate indoor 3D visual reconstruction system that can operate in any given environment without GPS, using a human-operated mobile cart equipped with a laser scanner, a CCD camera, an omnidirectional camera, and a computer. Using the system, accurate indoor 3D visual data are reconstructed automatically. The obtained 3D data can be used for rescue operations, for guiding blind or partially sighted persons, and so forth.
In optical flow for motion estimation, the reliability of the resulting motion vectors (MVs) is an important issue, since noise can make the output of optical flow algorithms unreliable. We find that many classical optical flow algorithms perform better under noisy conditions when combined with a modern robust optimization model. This paper introduces robust optical flow models that apply an adaptive Lorentzian norm influence function to simple spatial-temporal optical flow algorithms used as filtering models in a simple frame-to-frame correlation technique. Experiments on the proposed models confirm better noise tolerance in the resulting MVs under noisy conditions. We illustrate the performance of our models on several typical sequences with different foreground and background movement speeds, where the test sequences are contaminated by additive white Gaussian noise (AWGN) at different noise levels (dB). The results, measured by peak signal-to-noise ratio (PSNR), show that the models are highly noise tolerant.
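As background (a sketch, not the paper's code), the Lorentzian robust norm and the influence weight it induces in iteratively reweighted estimation look like this; `sigma` is the scale parameter that an adaptive scheme would tune:

```python
import math

def lorentzian_rho(x, sigma):
    """Lorentzian robust error norm: grows only logarithmically in |x|,
    so large residuals (e.g. noise-corrupted intensity differences)
    are penalized far less than under a quadratic norm."""
    return math.log(1.0 + (x * x) / (2.0 * sigma * sigma))

def lorentzian_weight(x, sigma):
    """Influence weight psi(x)/x = 2 / (2*sigma^2 + x^2), used to
    down-weight outlier residuals in reweighted least squares."""
    return 2.0 / (2.0 * sigma * sigma + x * x)


print(lorentzian_rho(0.0, 1.0))       # zero residual, zero penalty
print(lorentzian_weight(10.0, 1.0))   # a large residual gets a tiny weight
```

The key property is visible in the weight function: as the residual grows, its contribution to the flow estimate shrinks, which is what makes the combined model tolerant of AWGN.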
High-spatial-resolution images are in great demand in modern digital signal processing (DSP) and digital image processing (DIP) applications, so image spatial enhancement, especially at very high enhancement rates, has been investigated intensively in the DSP and DIP community over the last twenty-five years. This paper proposes an ultra-rate spatial enhancement employing MSRR with Huber maximum-likelihood (ML) regularization and SSRR with Huber high-spectrum estimation, enhancing spatial resolution by up to 16x. First, a collection of noisy low-resolution images is processed by MSRR to attenuate the noise and enhance the spatial resolution. The enhanced image is then processed by SSRR to estimate the high-spectrum information and reconstruct the final image at a 16x spatial enhancement rate. In the performance evaluation, the simulated results of the proposed ultra-rate spatial enhancement are compared with previous state-of-the-art techniques (bicubic interpolation, classical MSRR, and classical SSRR) in terms of both peak signal-to-noise ratio (PSNR) and visual quality. Across four noise types at several noise powers, the proposed ultra-rate spatial enhancement outperforms the previous state of the art.
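For completeness, the PSNR figure of merit used in the comparison is computed as follows (a generic sketch over flat pixel sequences, not tied to the paper's pipeline):

```python
import math

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio, in dB, between two equally sized images
    given as flat sequences of pixel values: 10*log10(peak^2 / MSE)."""
    mse = sum((a - b) ** 2 for a, b in zip(ref, img)) / len(ref)
    if mse == 0:
        return float("inf")   # identical images
    return 10.0 * math.log10(peak * peak / mse)


print(psnr([0, 0, 0, 0], [255, 255, 255, 255]))   # worst case for 8-bit data
```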
Consumer-priced remote eye trackers have been used for various applications on flat computer screens. Meanwhile, 3D gaze tracking in physical environments has proved useful for visualizing gaze behavior, controlling robots, and assistive technology. Rather than using affordable remote eye trackers, 3D gaze tracking in physical environments has so far been performed with corporate-level head-mounted eye trackers, limiting its practical usage to niche users. In this research, we propose a novel method to estimate 3D gaze using a consumer-level remote eye tracker. We implement a geometric approach to obtain the 3D point of gaze from the binocular lines of sight. Experimental results show that the proposed method yielded low errors of 3.47±3.02 cm, 3.02±1.34 cm, and 2.57±1.85 cm in the X, Y, and Z dimensions, respectively. The proposed approach may serve as a starting point for designing interaction methods in 3D physical environments.
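One standard geometric construction for a binocular point of gaze (a sketch under the assumption that the two lines of sight are skew, not the paper's exact method) is the midpoint of the shortest segment between the two lines:

```python
def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def gaze_point(p1, d1, p2, d2):
    """Midpoint of the shortest segment between two lines of sight,
    each given by an eye position p and a gaze direction d."""
    n = cross(d1, d2)
    n2 = dot(n, n)                       # zero when the two lines are parallel
    w = sub(p2, p1)
    # Standard skew-line closest-point parameters via scalar triple products.
    t1 = dot(cross(w, d2), n) / n2
    t2 = dot(cross(w, d1), n) / n2
    a = tuple(p + t1 * d for p, d in zip(p1, d1))
    b = tuple(p + t2 * d for p, d in zip(p2, d2))
    return tuple((x + y) / 2.0 for x, y in zip(a, b))


# Two perpendicular lines passing 1 unit apart: the gaze point lies halfway.
print(gaze_point((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0)))
```

In practice the two lines of sight rarely intersect exactly, which is why the midpoint of the common perpendicular is used rather than a true intersection.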
In this paper, we introduce content-adaptive sparse reconstruction to achieve optimized magnification quality for down-sampled video frames: two stages of pruning select the most closely correlated images for construction of an over-complete dictionary, which drives the sparse representation of the enlarged frame. In this way, not only are the sampling and dictionary training processes accelerated and optimized for the content of the input frame, but an add-on video compression codec can also be built by applying the scheme as a preprocessor to any standard video compression algorithm. Our extensive experiments illustrate that (i) the proposed content-adaptive sparse reconstruction outperforms the existing benchmark in super-resolution quality; and (ii) when applied to H.264, one of the international video compression standards, the proposed add-on codec achieves three times more compression while maintaining competitive decoding quality.
In this paper a sample-adaptive prediction technique is proposed to yield efficient intra-coding performance for screen content video coding. Sample-based prediction reduces spatial redundancy among neighboring samples. To this end, the proposed technique predicts each sample as a weighted linear combination of neighboring samples and applies a robust optimization technique, namely ridge estimation, to derive the weights on the decoder side. Ridge estimation uses an L2-norm regularization term, so the solution is more robust to high-variance samples such as the sharp edges and high color contrasts exhibited in screen content videos. Experimental results demonstrate that the proposed technique provides improved coding gain compared with the HEVC screen content coding reference software.
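The ridge estimate at the heart of this technique has a simple closed form. Here is an illustrative two-predictor version with the 2x2 inverse written out explicitly (a sketch of ridge estimation in general, not the codec's internal solver):

```python
def ridge_weights(x1, x2, y, lam):
    """Closed-form ridge estimate w = (X^T X + lam*I)^-1 X^T y for two
    predictor columns x1, x2 and a target column y."""
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    r1 = sum(a * t for a, t in zip(x1, y))
    r2 = sum(b * t for b, t in zip(x2, y))
    det = s11 * s22 - s12 * s12          # explicit 2x2 matrix inverse
    return ((s22 * r1 - s12 * r2) / det, (s11 * r2 - s12 * r1) / det)


# Exact linear data y = 2*x1 + 3*x2: with lam=0 the weights are recovered;
# with lam>0 the L2 penalty shrinks them, which is the robustness mechanism.
print(ridge_weights([1, 0, 1], [0, 1, 1], [2, 3, 5], 0.0))
print(ridge_weights([1, 0, 1], [0, 1, 1], [2, 3, 5], 1.0))
```

The shrinkage produced by `lam` is exactly the L2 regularization the abstract credits for stability on sharp edges and high-contrast content.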
For dynamic vision sensors (DVS), background activity (BA) induced by thermal noise and junction leakage current is the major cause of image quality deterioration. Inspired by the smoothing filter formed by horizontal cells in the vertebrate retina, a DVS pixel with a neighbor pixel communication (NPC) filtering structure is proposed to address this issue. The NPC structure judges the validity of a pixel's activity through communication with its four adjacent pixels; a pixel's output is suppressed if its activity is determined not to be real. The proposed pixel's area is 23.76×24.71 μm2 and it introduces only 3 ns of output latency. To validate the effectiveness of the structure, a 5×5 pixel array has been implemented in an SMIC 0.13 μm CIS process. Three test cases on the array's behavioral model show that the NPC-DVS is able to filter out the BA.
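The filtering principle can be sketched in software. The following toy event filter (an assumption-laden behavioral analogue, not the authors' circuit) keeps an event only if one of its four neighbors has fired recently, which suppresses isolated background activity:

```python
def npc_filter(events, dt):
    """Toy neighbor-based BA suppression: an event (t, x, y), with events
    sorted by time t, is kept only if one of its 4-adjacent pixels fired
    within the last dt time units."""
    last = {}          # (x, y) -> time of that pixel's most recent event
    kept = []
    for t, x, y in events:
        neighbors = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
        if any(t - last.get(n, float("-inf")) <= dt for n in neighbors):
            kept.append((t, x, y))
        last[(x, y)] = t
    return kept


# An isolated noise event at (5, 5) is dropped; the correlated pair survives.
print(npc_filter([(0, 1, 1), (1, 2, 1), (10, 5, 5)], 2))
```

Real scene activity is spatially correlated across neighboring pixels, while leakage-induced BA is not, which is why this simple support test separates the two.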
As the world's population grows, urbanization generates crowded situations that pose challenges to public safety and security. Manual analysis of crowded scenes is tedious and usually error-prone. In this paper, we propose a novel crowd analysis technique whose aim is to detect the dominant motion patterns in real-time videos. A motion field is generated by computing dense optical flow and is then divided into blocks. For each block, we adopt an intra-clustering algorithm to detect the different flows within the block; we then employ inter-clustering to cluster flow vectors across blocks. We evaluate the performance of our approach on different real-time videos. The experimental results show that the proposed method is capable of detecting distinct motion patterns in crowded videos and outperforms state-of-the-art methods.
In this paper we propose an on-line motion planning strategy for autonomous robots in dynamic and locally observable environments. In this approach, we first identify geometric shapes in the environment by filtering images. An ART-2 network is then used to establish the similarity between patterns. The proposed algorithm allows a robot to establish its relative location in the environment and to define its navigation path based on images of the environment and their similarity to reference images. This is an efficient, minimalist method that uses the similarity of landmark view patterns to navigate to the desired destination. Laboratory tests on real prototypes demonstrate the performance of the algorithm.
This paper presents a novel robust color-image zero-watermarking scheme based on singular value decomposition (SVD) and visual cryptography. The low-frequency component of a one-level discrete wavelet transform of the color image is partitioned into blocks, and a feature generated from the first singular value of each block is used to construct the master share. A visual secret sharing scheme then constructs the ownership share from the watermark and the image feature. When an ownership dispute occurs, the ownership share is used to extract the watermark. Experimental results show that the proposed watermarking system is more robust to various attacks, including noise, filtering, and JPEG compression, than other visual cryptography based color-image watermarking algorithms.
In this paper, we propose a depth estimation method for multi-view image sequences. To enhance the accuracy of dense matching and reduce the inaccurate matches produced by imprecise feature description, we select multiple matching points to build candidate matching sets. We then compute an optimal depth from each candidate matching set subject to multiple constraints (an epipolar constraint, a similarity constraint, and a depth consistency constraint). To further increase the accuracy of depth estimation, the depth consistency constraint on neighboring pixels is used to filter out inaccurate matches. On this basis, to obtain a more complete depth map, depth diffusion is performed using the neighboring pixels' depth consistency constraint. Through experiments on benchmark multi-view stereo datasets, we demonstrate the superior accuracy of the proposed method over the state of the art.
Video tracking is a major field of computer vision, and the TLD algorithm plays a key role in long-term tracking. However, the original TLD ignores the color features of patches during detection and tracks points taken from a uniform grid, both of which limit its tracking accuracy. This paper presents a novel TLD algorithm with Harris corners and color moments to overcome these drawbacks. Instead of tracking grid points, we use the Harris corner detector to select more informative points and reject half of the patches; these points better capture the object's textural features. In addition, a color moment classifier replaces the patch variance test to reduce detection errors. The classifier compares nine-dimensional color moment vectors, so it preserves TLD's stable speed. Experiments show that our TLD tracks a more reliable position with higher accuracy without affecting the speed.
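The nine-dimensional color moment vector is the standard first three moments per RGB channel. An illustrative computation (a generic sketch, not the paper's classifier) over a patch given as a list of RGB pixels:

```python
import math

def color_moments(pixels):
    """Nine-dimensional color moment vector: for each of the three channels,
    the mean, the standard deviation, and the signed cube root of the
    third central moment (skewness) of a list of (r, g, b) pixels."""
    feats = []
    n = float(len(pixels))
    for ch in range(3):
        vals = [p[ch] for p in pixels]
        mean = sum(vals) / n
        var = sum((v - mean) ** 2 for v in vals) / n
        skew = sum((v - mean) ** 3 for v in vals) / n
        feats += [mean, math.sqrt(var), math.copysign(abs(skew) ** (1 / 3), skew)]
    return feats


# Two pixels: red varies, green is constant, blue is zero throughout.
print(color_moments([(0, 10, 0), (10, 10, 0)]))
```

Two patches are then compared by a distance between these nine-number summaries, which is far cheaper than comparing the patches pixel by pixel.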
To make environmental perception by artificial intelligence more flexible, supporting software modules are needed that can automate the creation of a specific language syntax and perform further analysis for relevant decisions based on semantic functions. Under our proposed approach, pairs of formal rules can be created for given sentences (in the case of natural languages) or statements (in the case of special-purpose languages) with the help of computer vision, speech recognition, or an editable text conversion system, and then automatically refined. In other words, we have developed an approach that significantly improves the automation of the training process of an artificial intelligence, giving it, as a result, a higher level of self-development independent of its users. Based on this approach we have developed a demonstration version of the software, which includes the algorithm and code implementing all of the above-mentioned components (computer vision, speech recognition, and an editable text conversion system). The program can work in multi-stream mode and simultaneously create a syntax from information received from several sources.
We propose a method for measuring the tail beat frequency (TBF) and coast phase (CP) of individual fish in a school using visual tracking. In analyses of fish swimming behavior, features that represent fish movement, e.g., TBF and CP, are commonly used in biological and fisheries research. We propose a measurement method for such features using a particle filter and apply it to a large school of fish in an aquarium. Experimental results show that the TBFs and CPs are measured accurately enough for further analysis of fish behavior: the average error of the TBFs was 0.126 Hz, and the precision and recall of the classification for CP detection were 0.945 and 0.879, respectively.
In this paper, we propose an approach for 3D gaze estimation under head pose variation using an RGB-D camera. Our method uses a 3D eye model to determine the 3D optical axis and infer the 3D visual axis. To this end, we robustly estimate the user's head pose parameters and eye pupil locations with ensembles of randomized trees trained on a large annotated training set. After projecting the eye pupil locations into the sensor coordinate system using the sensor's intrinsic parameters, and after a simple one-time calibration in which the user gazes at a known 3D target from different directions, the 3D eyeball centers of both eyes are determined for a specific user, yielding the visual axis. Experimental results demonstrate that our method achieves good gaze estimation accuracy even in a highly unconstrained environment with large user-to-sensor distances (>1.5 m), unlike state-of-the-art methods, which deal with relatively small distances (<1 m).
Since correct critical points are crucial for most shape decomposition algorithms, a variety of part-related measures have been presented to detect them. Among these, the electrical charge distribution on the shape (ECDS) and its variants have some distinctive characteristics and advantages, such as invariance and smoothness. However, we find it still challenging to obtain satisfactory critical points, especially in flat areas such as the tails and legs of shapes. In this paper, we propose a novel way to make ECDS exhibit low values at given critical points; that is, critical points from other part-related measures can be introduced into ECDS, greatly improving its descriptive ability. To achieve this, we add constraints to the linear equations and then relax these constraints in an anisotropic heat diffusion manner. Furthermore, we put forward a novel approach to finding the stable extreme points of the improved ECDS (IECDS), which usually correspond to critical points. Finally, experiments on the shapes in the MPEG-7 dataset demonstrate that our method obtains more meaningful critical points than existing methods.
The present study illustrates the spatio-temporal dynamics of land use/cover change in Astrakhan city, Russia. Landsat satellite imagery from three periods, 2000, 2007, and 2015, was acquired from the Earth Explorer website to quantify the changes in Astrakhan. Maximum-likelihood supervised classification together with post-classification change detection was applied to the satellite images for 2000, 2007, and 2015 in order to map land use/cover changes. The land use/cover was classified into five major classes: agriculture, bare land, settlements, vegetation, and water bodies. The classification results were then refined using ancillary data, visual interpretation, and expert knowledge of the area, along with GIS. After post-classification change detection, a change image was generated from the cross-tabulations. The results show extensive vegetation degradation and waterlogging in different parts of the study area.
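The cross-tabulation underlying post-classification change detection is simply a count of (class-before, class-after) pairs over co-registered label maps, as in this generic sketch (not the study's GIS workflow):

```python
from collections import Counter

def change_matrix(map_before, map_after):
    """Post-classification change cross-tabulation: counts of
    (class_before, class_after) pairs over two co-registered label maps
    given as flat sequences of class labels."""
    return Counter(zip(map_before, map_after))


before = ["veg", "veg", "water"]
after = ["veg", "bare", "water"]
print(change_matrix(before, after))   # one veg->bare transition detected
```

Off-diagonal entries of this matrix (e.g. vegetation to bare land) are the changes that get mapped into the change image.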
This paper presents a novel approach to drill condition assessment using deep learning. The level of drill wear is assessed on the basis of images of the drilled holes. Two states of the drill are considered: sharp enough to continue production, and worn out. The decision is based on the shape of the hole and the level of hole shredding; in this way drill condition assessment becomes a problem of image analysis and classification. Deep learning is applied as a novel way to solve this classification task. An important advantage of this method is the great simplification of the recognition procedure: no hand-crafted features are needed, and the focus can be placed on the most interesting aspects of data mining and machine learning. The results obtained are among the best in comparison with other approaches to this problem.
This study concerns the determination of the moisture content of milled rice using image processing and a perceptron neural network. The algorithm takes several inputs and produces one output: the moisture content of the milled rice. Several types of milled rice are used in this study, namely Jasmine, Kokuyu, 5-Star, Ifugao, Malagkit, and NFA rice. The captured images are processed using MATLAB R2013a. A USB dongle connected to the router provides the internet connection for online web access, while a GizDuino IOT-644 handles the temperature and humidity sensor and the sending and receiving of data between the computer and cloud storage. The results are compared with the actual moisture content range measured by a moisture tester for milled rice. Based on the results, this study provides accurate data for determining the moisture content of milled rice.
This paper describes the design and construction of a low-cost 3D scanner able to digitize solid objects through contactless data acquisition, using active reflection off the object. 3D scanners are used in different fields such as science, engineering, and entertainment, and are classified into contact and contactless scanners; the latter are the most widely used but are expensive. Our low-cost prototype performs a vertical scan of the object using a fixed camera and a moving horizontal laser line, which is deformed according to the three-dimensional surface of the solid. Using digital image processing, the deformation detected by the camera is analyzed, allowing the 3D coordinates to be determined by triangulation. The obtained information is processed by a MATLAB script, which gives the user a point cloud corresponding to each horizontal scan. The results show acceptable quality and significant detail in the digitized objects, making this prototype (built on a LEGO Mindstorms NXT kit) a versatile and cheap tool that can be used for many applications, mainly by engineering students.
In the field of artificial societies, particularly those based on memetics, imitative behavior is essential for the development of cultural evolution. Applying this concept to robotics, through imitative learning a robot can acquire behavioral patterns from another robot. Given that the learning process must involve an instructor and at least one apprentice, a quantitative measure of their behavioral similarity would be potentially useful, especially in artificial social systems focused on cultural evolution. In this paper the motor behavior of both kinds of robots, for two simple tasks, is represented by 2D binary images, which are processed in order to measure their behavioral similarity. The results shown here were obtained by comparing several similarity measures for binary images.
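One of the simplest similarity measures for binary images of this kind is the Jaccard index, sketched here over flattened 0/1 images (an illustrative example of the class of measures compared, not necessarily one the paper uses):

```python
def jaccard(a, b):
    """Jaccard similarity of two binary images given as equal-length flat
    sequences of 0/1 values: |A ∩ B| / |A ∪ B| over the set pixels."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 1.0   # two empty images are identical


instructor = [1, 1, 0, 0]   # hypothetical flattened behavior images
apprentice = [1, 0, 1, 0]
print(jaccard(instructor, apprentice))   # one shared pixel out of three set
```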
Many ancient paintings, in particular frescoes, have parts ruined by time and events. Sometimes one or more non-negligible regions are completely lost, leaving a blank that restorers call a 'lacuna'. The general restoration philosophy adopted in these cases is to paint the interior of the lacuna with an achromatic or non-neutral uniform color carefully selected to minimize its overall perception. In this paper, we present a computational model, based on a well-established variational theory of color perception, that may facilitate the restorer's job by providing both achromatic and non-neutral colors that minimize the local contrast with the surrounding parts of the fresco.
We propose a novel lightweight method for classifying scene images that fits well on weak machines, mobile phones, or embedded devices. Our feature representation technique, which we call SOGI (Spatial Oriented Gradient Indexing), requires little computation time and memory. We show that capturing the spatial co-occurrence of gradient pairs provides a sufficient amount of information for the scene classification task. Despite its simplicity, experimental results show that our method is comparable to more complicated methods on their own datasets.
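The core idea, counting how quantized gradient orientations co-occur at a fixed spatial offset, can be sketched as follows; the bin count and offset are illustrative assumptions, not the paper's parameters:

```python
import math

def orientation_bins(img, bins=4):
    """Quantize per-pixel gradient orientation (forward differences) into bins."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h - 1):
        row = []
        for x in range(w - 1):
            dx = img[y][x + 1] - img[y][x]
            dy = img[y + 1][x] - img[y][x]
            ang = math.atan2(dy, dx)  # in [-pi, pi]
            row.append(int((ang + math.pi) / (2 * math.pi) * bins) % bins)
        out.append(row)
    return out

def sogi_histogram(orient, offset=(0, 1), bins=4):
    """Histogram of orientation pairs (o[p], o[p + offset]) over the image."""
    hist = [0] * (bins * bins)
    dy, dx = offset
    h, w = len(orient), len(orient[0])
    for y in range(h):
        for x in range(w):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                hist[orient[y][x] * bins + orient[ny][nx]] += 1
    return hist
```

Concatenating such histograms for a few offsets gives a compact descriptor that a lightweight classifier can consume.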
This paper applies image processing methods widely used in Earth remote sensing to the processing and visualization of images at nano-resolution, since most such images are currently analyzed only by expert researchers without a proper statistical background. The nano-resolution level ranges from picometre resolution up to that of a light microscope, about 200 nanometers. Images at nano-resolution play an essential role in physics, medicine, and chemistry. Three case studies demonstrate different image visualization and analysis approaches at different scales within the nano-resolution level. The results of the case studies prove the suitability and applicability of Earth remote sensing methods for image visualization and processing at the nano-resolution level, and even open new dimensions for spatial analysis at such extreme spatial detail.
This paper proposes a visual estimator of temperature and oxygen content for closed-loop control of a carbonization furnace in the production of activated carbon. The carbonization process involves thermal decomposition of vegetal material in the absence of air, which requires rigorous sensing and control of these two variables. The system consists of two cameras: a thermographic camera to estimate the temperature, and a conventional digital camera to estimate the oxygen content. In both cases we use similarity measures between images to estimate the value of the variables inside the furnace, and this estimate is used to control the furnace flame. The algorithm is tested with reference photos taken at the production plant, and the experimental results demonstrate the performance of the proposed technique.
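A common image similarity measure for matching a live frame against reference photos is zero-mean normalized cross-correlation; the abstract does not name its measure, so this choice is an illustrative assumption:

```python
import math

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equally sized grayscale
    images, in [-1, 1]: 1 for identical structure, -1 for inverted structure."""
    xa = [v for row in a for v in row]
    xb = [v for row in b for v in row]
    ma, mb = sum(xa) / len(xa), sum(xb) / len(xb)
    num = sum((x - ma) * (y - mb) for x, y in zip(xa, xb))
    den = math.sqrt(sum((x - ma) ** 2 for x in xa) *
                    sum((y - mb) ** 2 for y in xb))
    return num / den if den else 0.0
```

The reference photo with the highest correlation to the current frame would then index the estimated temperature or oxygen level.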
A novel stroke simulation method for the half-dry style of Chinese calligraphy, based on force feedback technology, is proposed for virtual painting. First, according to the deformation of the brush when force is exerted on it, the brush footprint between the brush and the paper is calculated. The complete brush stroke is obtained by superimposing brush footprints along the painting direction, and dynamic painting of the brush stroke is implemented. We then establish half-dry texture databases and propose the concept of a half-dry value by studying the main factors that affect the appearance of the half-dry stroke. During virtual painting, the half-dry texture is mapped onto the stroke in real time according to the half-dry value and the painting technique. A texture blending technique based on the KM model is applied to avoid seams during texture mapping. The proposed method has been successfully applied in a force-feedback virtual painting system in which users paint in real time with a Phantom Desktop haptic device, which effectively enhances the sense of realism.
Surface height map estimation is an important task in high-resolution 3D reconstruction. It differs from general scene depth estimation in that surface height maps contain more high-frequency information, i.e. fine details. Existing methods based on radar or other equipment can be used for large-scale scene depth recovery but may fail for small-scale surface height map estimation. Although some methods are available for surface height reconstruction from multiple images, e.g. photometric stereo, height map estimation directly from a single image remains challenging. In this paper, we present a novel method based on convolutional neural networks (CNNs) for estimating the height map from a single image, without special equipment or extra prior knowledge of the image contents. Experimental results on procedural and real texture datasets show that the proposed algorithm is effective and reliable.
This study aims to reveal the differences in brain structure and morphological change between patients with mild cognitive impairment (MCI) and normal controls (NC), and to analyze and predict the risk of MCI conversion. First, baseline and 2-year longitudinal follow-up magnetic resonance (MR) images of 73 NC, 46 patients with stable MCI (sMCI), and 40 patients with converted MCI (cMCI) were selected. Second, FreeSurfer was used to extract cortical features, including cortical thickness, surface area, gray matter volume, and mean curvature. Third, the support vector machine-recursive feature elimination (SVM-RFE) method was adopted to determine salient features for effective discrimination. Finally, the distribution and importance of the essential brain regions were described. The experimental results showed that cortical thickness and gray matter volume exhibited prominent discriminative capability, whereas surface area and mean curvature performed relatively weakly. Furthermore, combining different morphological features, especially the baseline features with the longitudinal changes, evidently improved classification performance. In addition, the brain regions with high weights were predominantly located in the temporal and frontal lobes, which are related to emotional control and memory functions. This suggests that there are significantly different patterns in brain structure and its changes between the compared groups, which can not only be effectively applied to classification but also be used to evaluate and predict the conversion of patients with MCI.
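Recursive feature elimination can be illustrated with a minimal sketch. Here a feature's absolute Pearson correlation with the label stands in for the SVM weight magnitude that the actual SVM-RFE method uses, and no classifier is retrained between rounds; both are deliberate simplifications:

```python
import math

def _pearson(xs, ys):
    """Pearson correlation; 0.0 for a zero-variance column."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0

def rfe_rank(X, y):
    """Iteratively eliminate the weakest remaining feature; return feature
    indices best-first. Real SVM-RFE would refit the SVM on the remaining
    features at every round and rank by |weight|."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        worst = min(remaining,
                    key=lambda j: abs(_pearson([row[j] for row in X], y)))
        remaining.remove(worst)
        eliminated.append(worst)
    return eliminated[::-1]
```

The top-ranked indices would correspond to the salient cortical features kept for classification.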
With the growing demand for more efficient wound care after surgery, there is a need for a machine learning based image analysis approach to reduce the burden on health care professionals. The aim of this study was to propose a novel approach to recognizing wound infection at the post-surgical site. First, we proposed an optimal clustering method based on the unimodal Rosin threshold algorithm to group the feature points of a potential wound area into clusters forming regions of interest (ROIs), each regarded as a suture site of the wound area. Automatic infection interpretation based on a support vector machine is then available to assist physicians in clinical decision-making. According to clinical physicians' judgment criteria and international guidelines for wound infection interpretation, we defined the following infection detector modules: (1) Swelling Detector, (2) Blood Region Detector, (3) Infected Detector, and (4) Tissue Necrosis Detector. To validate the proposed system, a retrospective study was conducted to verify the classification models, using wound pictures diagnosed by surgical physicians as the gold standard. Through cross-validation on 42 wound images, our classifiers achieved 95.23% accuracy, 93.33% sensitivity, 100% specificity, and 100% positive predictive value. We believe this capability could help medical practitioners in clinical decision-making.
Cardiac magnetic resonance imaging (CMRI) is a significant aid in the clinical diagnosis of cardiovascular disease. Segmentation of the right ventricle (RV) is essential for cardiac function evaluation, especially for RV function measurement. Automatic RV segmentation is difficult due to intensity inhomogeneity and the irregular shape of the RV. In this paper, we propose an automatic RV segmentation framework. First, we filter the CMRI with anisotropic diffusion. Then, the endocardium is extracted by simplified pulse coupled neural network (SPCNN) segmentation. Finally, morphological operations are used to obtain the epicardium. The experimental results show that our method performs well for both endocardium and epicardium segmentation.
Image denoising is one of the fundamental tasks in image processing. In medical imaging, an effective algorithm for removing random noise from MR images is important. This paper proposes an effective noise reduction method for brain magnetic resonance (MR) images. Our approach is based on the collateral filter, which is more powerful than the bilateral filter in many cases; however, the collateral filter algorithm is quite time-consuming. To solve this problem, we accelerated the collateral filter with parallel computing on a GPU, adopting CUDA, NVIDIA's application programming interface for GPU computing. Our experimental evaluation on an Intel Xeon CPU E5-2620 v3 at 2.40 GHz with an NVIDIA Tesla K40c GPU indicated that the proposed implementation runs dramatically faster than the traditional collateral filter. We believe the proposed framework establishes a general blueprint for achieving fast and robust filtering in a wide variety of medical image denoising applications.
To repair the boundary depressions caused by juxtapleural nodules and improve lung segmentation accuracy, we propose a new boundary correction method for the lung parenchyma. First, a top-hat filter is used to enhance image contrast; second, the Otsu algorithm is employed for image binarization; third, a connected component labeling algorithm removes the main trachea; fourth, the initial mask image is obtained by a morphological region filling algorithm; fifth, a boundary tracing algorithm extracts the initial lung contour; next, we design a sudden-change-degree algorithm to modify the initial lung contour; finally, the complete lung parenchyma image is obtained. The novelty is that the sudden-change-degree algorithm detects inflection points more accurately than other methods, which helps repair the lung contour efficiently. The experimental results show that the proposed method effectively incorporates juxtapleural nodules into the lung parenchyma, with precision increased by 6.46% and 2.72% respectively compared with two other methods, providing favorable conditions for the accurate detection of pulmonary nodules and having important clinical value.
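The Otsu binarization step picks the threshold that maximizes the between-class variance of the gray-level histogram. A minimal sketch, operating on a histogram rather than a full CT slice:

```python
def otsu_threshold(hist):
    """Otsu's method on a gray-level histogram: return the threshold t that
    maximizes the between-class variance w0*w1*(m0 - m1)^2, where class 0
    comprises the bins <= t."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t, h in enumerate(hist):
        w0 += h                       # class-0 pixel count
        sum0 += t * h                 # class-0 intensity sum
        w1 = total - w0               # class-1 pixel count
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1   # class means
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

# A bimodal histogram with peaks at bins 1 and 6 splits between the modes.
print(otsu_threshold([0, 10, 2, 0, 0, 2, 10, 0]))
```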
It is very important to differentiate temporal lobe epilepsy (TLE) patients from healthy people and to localize the abnormal brain regions of TLE patients. Cortical features and their changes can reveal the unique anatomical patterns of brain regions in structural MR images. In this study, structural MR images were acquired from 28 normal controls (NC), 18 left TLE (LTLE), and 21 right TLE (RTLE) patients, and four types of cortical feature, namely cortical thickness (CTh), cortical surface area (CSA), gray matter volume (GMV), and mean curvature (MCu), were explored for discriminative analysis. Three feature selection methods, independent-sample t-test filtering, the sparse-constrained dimensionality reduction model (SCDRM), and support vector machine-recursive feature elimination (SVM-RFE), were investigated to extract dominant regions with significant differences among the compared groups for classification with an SVM classifier. The results showed that SVM-RFE achieved the highest performance (most classifications exceeding 92% accuracy), followed by the SCDRM and the t-test. In particular, surface area and gray matter volume exhibited prominent discriminative ability, and the performance of the SVM improved significantly when the four cortical features were combined. Additionally, the dominant regions with higher classification weights were mainly located in the temporal and frontal lobes, including the inferior temporal, entorhinal cortex, fusiform, parahippocampal cortex, middle frontal, and frontal pole. The cortical features thus provide effective information for determining abnormal anatomical patterns, and the proposed method has the potential to improve the clinical diagnosis of TLE.
Tuberculosis (TB) is an infectious disease and a major threat all over the world, yet its diagnosis remains a challenging task. In the literature, chest radiographs are considered the most commonly used medical images for diagnosing TB in developing countries. Different methods have been proposed, but they are not helpful to radiologists due to cost and accuracy issues. This paper presents a methodology in which different combinations of features based on the intensities, shape, and texture of the chest radiograph are extracted and given to a classifier for the detection of TB. The performance of our methodology is evaluated on the publicly available standard Montgomery County (MC) dataset, which contains 138 chest X-rays (CXRs), of which 80 are normal and 58 are abnormal, including effusion and miliary patterns. An accuracy of 81.16% was achieved, and the results show that the proposed method outperforms existing state-of-the-art methods on the MC dataset.
Breast cancer is the most common cancer among women. Micro-calcification clusters on X-ray mammograms are one of the most important abnormalities and are effective for early cancer detection. The Surrounding Region Dependence Method (SRDM), a statistical texture analysis method, is applied to detect regions of interest (ROIs) containing micro-calcifications. Inspired by the SRDM, we present a method that extracts gray-level and other features that are effective for predicting positive and negative regions of micro-calcification clusters in mammograms. By constructing a set of artificial images containing only micro-calcifications, we locate the suspicious calcification pixels of an SRDM matrix in the original image map. Features are extracted from these pixels for the imbalanced data, and then the repeated random subsampling method and a Random Forest (RF) classifier are used for classification. The true positive (TP) and false positive (FP) rates reflect the quality of the result: the TP rate is 90% and the FP rate is 88.8% when the threshold q is 10. We draw the receiver operating characteristic (ROC) curve, and the area under the ROC curve (AUC) reaches 0.9224, indicating that our method is effective. A novel method for detecting regions of micro-calcification clusters is thus developed, based on new features for imbalanced data in mammography, and it can help improve the accuracy of computer-aided diagnosis of breast cancer.
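The AUC reported above can be computed directly from classifier scores via the rank-based (Mann-Whitney) formulation, without constructing the ROC curve explicitly; a minimal sketch:

```python
def auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen
    positive sample is scored higher than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Perfect separation of positives from negatives gives AUC = 1.0.
print(auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```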
Poisson and Gaussian noise are known to affect computed tomography (CT) image quality during reconstruction. The standard median (SM) filter has been widely used to reduce unwanted impulsive noise, but it cannot perform satisfactorily once the noise density is high. The recursive median (RM) filter has been proposed to improve denoising, but at the cost of degraded image quality. In this paper, we propose a hybrid recursive median (RGSM) filtering technique that uses Gauss-Seidel relaxation to enhance denoising while preserving image quality in the RM filter. First, SM filtering is performed, followed by Gauss-Seidel relaxation, and the two are combined to generate a secondary approximation solution. This scheme is iterated by feeding the secondary approximation solution into the successive iterations, accomplishing progressive noise reduction at every stage; the last stage generates the final solution. Experiments on CT lung images show that the proposed technique achieves higher noise reduction than conventional RM filtering. The results also confirm better preservation of anatomical detail. The proposed technique may improve lung nodule segmentation and characterization performance.
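The recursive median principle, in which already-filtered samples feed back into the window used for subsequent samples, can be illustrated in one dimension (a simplification of the 2D CT-image case; the window size is an assumption):

```python
def recursive_median_1d(x, k=3):
    """Recursive median filter: each output sample immediately replaces its
    input, so the window for later samples contains already-filtered values.
    Border samples (within k//2 of an end) are left unchanged."""
    y = list(x)
    r = k // 2
    for i in range(r, len(y) - r):
        window = sorted(y[i - r:i + r + 1])
        y[i] = window[r]
    return y

# Isolated impulses are removed even when they recur along the signal.
print(recursive_median_1d([1, 9, 1, 1, 9, 1, 1]))
```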
Clustering is a common technique for image segmentation, but determining an appropriate number of clusters is still challenging. Because nuclei vary in size and shape in breast cancer images, we propose a method that automatically determines the number of clusters for segmenting nuclei in breast cancer images. The phases of nuclei segmentation are nuclei detection, touched-nuclei detection, and touched-nuclei separation. We use Gram-Schmidt orthogonalization for nuclei detection, geometric features for touched-nuclei detection, and a combination of the watershed algorithm and spatial k-means clustering for separating the touched nuclei. Spatial k-means clustering is employed to separate the touched nuclei, but automatically determining the number of clusters is difficult due to the variation in size and shape of single breast cancer cells. To overcome this problem, we first apply the watershed algorithm to separate the touched nuclei and then calculate the distances among centroids to resolve over-segmentation: two centroids whose distance falls below a threshold are merged, and the new number of centroids is used as input for segmenting the nuclei with the spatial k-means algorithm. Experiments show that the proposed scheme can improve the accuracy of nuclei counting.
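The over-segmentation fix, merging watershed centroids that fall closer than a distance threshold before re-running k-means, can be sketched as follows; replacing a merged pair by its midpoint is an illustrative assumption:

```python
import math

def merge_centroids(centroids, min_dist):
    """Greedily merge centroids closer than min_dist; a merged pair is
    replaced by its midpoint. The surviving count gives the cluster
    number k for the subsequent k-means pass."""
    merged = []
    for c in centroids:
        for i, m in enumerate(merged):
            if math.dist(c, m) < min_dist:
                merged[i] = tuple((a + b) / 2 for a, b in zip(m, c))
                break
        else:
            merged.append(tuple(c))
    return merged

# Two nearby centroids collapse into one; the distant one survives.
print(merge_centroids([(0, 0), (1, 0), (10, 10)], 2.0))
```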
The functional connectivity network (FCN), which represents the cross-correlation of regional blood-oxygenation-level-dependent signals, is an effective tool for classifying psychiatric disorders. However, an FCN is often incomplete, suffering from missing and spurious edges. To accurately distinguish psychiatric disorders from healthy controls with incomplete FCNs, we first 'repair' each FCN with link prediction, and then extract the clustering coefficients as features to build a weak classifier for every FCN. Finally, we apply a boosting algorithm to combine these weak classifiers and improve classification accuracy. Our method was tested on three psychiatric disorder datasets, covering Alzheimer's disease, schizophrenia, and attention deficit hyperactivity disorder. The experimental results show that our method not only significantly improves classification accuracy but also efficiently reconstructs the incomplete FCN.
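The clustering coefficient used as the per-node feature is standard: for an unweighted network it is the fraction of a node's neighbor pairs that are themselves connected. A minimal sketch for a 0/1 adjacency matrix:

```python
def clustering_coefficient(adj, v):
    """Local clustering coefficient of node v in an undirected graph given
    as a 0/1 adjacency matrix: 2 * (links among neighbors) / (k * (k - 1))."""
    nbrs = [u for u, e in enumerate(adj[v]) if e and u != v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(adj[a][b] for i, a in enumerate(nbrs) for b in nbrs[i + 1:])
    return 2 * links / (k * (k - 1))

# Every node of a triangle has coefficient 1.0.
print(clustering_coefficient([[0, 1, 1], [1, 0, 1], [1, 1, 0]], 0))
```

Collecting this value for every brain region yields the feature vector fed to each weak classifier.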
3D motion capture is difficult when performed in an outdoor environment without controlled surroundings. In this paper, we propose a new approach that uses two ordinary cameras arranged in a special stereoscopic configuration, together with passive markers on a subject's body, to reconstruct the subject's motion. First, for each frame of the video, an adaptive thresholding algorithm is applied to extract the markers on the subject's body. Once the markers are extracted, an algorithm matches corresponding markers across frames. Zhang's planar calibration method is used to calibrate the two cameras. Because the cameras use fisheye lenses, they cannot be well approximated by a pinhole camera model, which makes it difficult to estimate depth information; to recover the 3D coordinates, we therefore use a calibration method specific to fisheye lenses. The accuracy of the 3D coordinate reconstruction is evaluated by comparison with results from a commercially available Vicon motion capture system.
In this paper, we propose a simple method for foreground extraction with a moving RGBD camera. These cameras have been available for quite some time, and their popularity is primarily due to their low cost and wide availability. Although foreground extraction, or background subtraction, has long been explored by computer vision researchers, depth-based subtraction is relatively new and has not yet been extensively addressed. Most current methods make heavy use of geometric reconstruction, making the solutions quite restrictive. In this paper, we make novel use of RGB and RGBD data: from the RGB frame, we extract corner features (FAST) and represent them with the histogram of oriented gradients (HoG) descriptor, then train a non-linear SVM on these descriptors. During the test phase, we exploit the fact that the foreground object has a distinct depth ordering with respect to the rest of the scene: the positively classified FAST features on the test frame initiate a region growing that yields an accurate segmentation of the foreground object from the RGBD data alone. We demonstrate the proposed method on synthetic datasets, with encouraging quantitative and qualitative results.
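The depth-based region-growing step, seeded at positively classified feature locations, can be sketched as a breadth-first flood fill with a depth-difference tolerance; the exact acceptance rule is an illustrative assumption:

```python
from collections import deque

def region_grow(depth, seeds, tol):
    """Grow a foreground mask from seed pixels: a 4-neighbor joins the region
    when its depth differs from its already-accepted neighbor by at most tol."""
    h, w = len(depth), len(depth[0])
    mask = [[False] * w for _ in range(h)]
    queue = deque(seeds)
    for y, x in seeds:
        mask[y][x] = True
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not mask[ny][nx]
                    and abs(depth[ny][nx] - depth[y][x]) <= tol):
                mask[ny][nx] = True
                queue.append((ny, nx))
    return mask

# The near-depth region around the seed is segmented; the far column is not.
print(region_grow([[1, 1, 9], [1, 1, 9]], [(0, 0)], 1))
```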
Recently, IoT technologies have progressed, and applications in the maintenance area are anticipated. However, IoT maintenance applications are not yet widespread in Japan because sensing and analysis are solved one-off for each case, collecting sensing data is costly, and maintenance automation is insufficient. This paper proposes a maintenance platform that analyzes sound data at the edge, analyzes only anomalous data in the cloud, and orders maintenance automatically, resolving these existing problems. We also implement a sample application and compare it with related work.
Ground penetrating radar (GPR) point data plays an indispensable role in tunnel geological prediction. However, research on GPR point data remains scarce, and existing results do not meet the actual requirements of engineering projects. In this paper, a GPR point data interpretation model based on the Wigner distribution (WD) and a deep convolutional neural network (CNN) is proposed. First, the GPR point data are transformed by the WD to obtain time-frequency joint distribution maps; second, these maps are classified by the deep CNN, while the approximate location of the geological target is determined by inspecting the time-frequency maps in parallel; finally, the GPR point data are interpreted according to the classification results and the position information from the maps. Simulation results show that the classification accuracy on the test dataset (1200 GPR point data) is 91.83% at 200 iterations. The model offers high accuracy and fast training, and can provide a scientific basis for developing tunnel construction and excavation plans.
Deep learning currently provides the state-of-the-art algorithms for image classification. The complexity of these feedforward neural networks has passed a critical point, resulting in algorithmic breakthroughs in various fields. On the other hand, their complexity restricts them to tasks where high-throughput computing power is available. The optimization of these networks, considering computational complexity and applicability on embedded systems, has not yet been studied in detail. In this paper I show examples of how these algorithms can be optimized and accelerated on embedded systems.