Camera pose estimation remains a challenging task for augmented reality (AR) applications. Simultaneous localization and mapping (SLAM)-based methods estimate the six-degrees-of-freedom camera motion while constructing a map of an unknown environment. However, these methods provide no reference for where to insert virtual objects, since they carry no information about scene structure, and they may fail when three-dimensional (3-D) map points are occluded or the scene contains dynamic objects. This paper presents a real-time monocular piecewise planar SLAM method based on the planar scene assumption. Using planar structures in the mapping process allows virtual objects to be rendered in a meaningful way on the one hand and, on the other, improves camera pose precision and the quality of the 3-D reconstruction of the environment by adding constraints on 3-D points and poses in the optimization process. We propose to exploit the rigid motion of 3-D planes in the tracking process to enhance the robustness of the system in dynamic scenes. Experimental results show that enforcing planar scene constraints improves the accuracy and robustness of our system compared with classical SLAM systems.
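To illustrate the planar constraint idea in the abstract above (a minimal sketch, not the paper's implementation; the plane parameterization and residual form are assumptions), a map point X lying on a plane with unit normal n and offset d should satisfy n·X + d = 0. That signed distance can be added as an extra residual on 3-D points in the optimization, and projecting along the normal is the closed-form minimizer for a single point:

```python
def point_plane_residual(point, plane):
    """Signed distance of a 3-D point to a plane (nx, ny, nz, d) with unit normal."""
    nx, ny, nz, d = plane
    x, y, z = point
    return nx * x + ny * y + nz * z + d

def project_onto_plane(point, plane):
    """Closest point on the plane: move along the normal by the residual."""
    r = point_plane_residual(point, plane)
    nx, ny, nz, _ = plane
    x, y, z = point
    return (x - r * nx, y - r * ny, z - r * nz)
```

In a full system this residual would be weighted and summed with the reprojection error inside bundle adjustment, rather than applied as a hard projection.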
This paper presents a new technique for the detection of dental caries, a bacterial disease that destroys tooth structure. Our approach introduces a new segmentation method that combines the advantages of the fuzzy C-means (FCM) algorithm and the level set method. The results obtained by the FCM algorithm are used by the level set algorithm, which reduces the influence of noise on each of the two algorithms, facilitates the manipulation of the level sets, and leads to a more robust segmentation. The sensitivity and specificity obtained confirm the effectiveness of the proposed method for caries detection.
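The FCM stage described above can be sketched as follows (a minimal 1-D illustration of standard fuzzy C-means on pixel intensities, not the paper's actual pipeline; the deterministic initialization is an assumption). Each point gets a fuzzy membership in every cluster, and centers are recomputed as membership-weighted means:

```python
def fcm_1d(data, c=2, m=2.0, iters=50):
    """Minimal fuzzy C-means on 1-D data; returns sorted cluster centres."""
    # deterministic initialisation: spread centres across the data range
    lo, hi = min(data), max(data)
    centers = [lo + (hi - lo) * i / (c - 1) for i in range(c)]
    for _ in range(iters):
        # membership of x in cluster i: u_i = 1 / sum_j (d_i / d_j)^(2/(m-1))
        u = []
        for x in data:
            dists = [abs(x - v) or 1e-12 for v in centers]
            u.append([1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in dists)
                      for di in dists])
        # centre update: mean weighted by memberships raised to the fuzzifier m
        centers = [sum((u[k][i] ** m) * data[k] for k in range(len(data))) /
                   sum(u[k][i] ** m for k in range(len(data)))
                   for i in range(c)]
    return sorted(centers)
```

In the combined method, the resulting memberships would serve to initialize the level set contour near the carious region.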
Currently, several fall detection systems based on video analysis exist. However, these systems have not yet reached the desired level of reliability and robustness. To reduce the risk of falling in insecure environments, this paper develops a new method to detect and predict human falls. In this approach, we adopt a block matching motion estimation algorithm based on the acceleration and the changes of the human body silhouette area, both obtained from a single surveillance camera. We also present an algorithm that accelerates the fall detection system through a local adjustment of the velocity field.
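The block matching step above can be sketched as an exhaustive sum-of-absolute-differences (SAD) search (a minimal illustration under the usual formulation, not the paper's accelerated variant; block size and search radius are assumptions):

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def crop(frame, y, x, size):
    return [row[x:x + size] for row in frame[y:y + size]]

def best_motion(prev, curr, y, x, size, radius):
    """Displacement (dy, dx) within +/-radius minimising SAD for one block."""
    ref = crop(prev, y, x, size)
    best = None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny <= len(curr) - size and 0 <= nx <= len(curr[0]) - size:
                cost = sad(ref, crop(curr, ny, nx, size))
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2]
```

Aggregating such per-block displacements over the silhouette gives the velocity field from which acceleration cues for a fall can be derived.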
In this paper, we propose a deep self-organizing map model (Deep-SOMs) for automated feature extraction and learning from streaming big data, leveraging the Spark framework for real-time stream handling and highly parallel data processing. The deep SOM architecture is based on the notion of abstraction: patterns are automatically extracted from the raw data, from the less to the more abstract. The proposed model consists of three hidden self-organizing layers, an input layer, and an output layer. Each layer is made up of a multitude of SOMs, with each map focusing only on a local sub-region of the input image. Each layer then transforms this local information into more global information in the next, higher layer. The proposed Deep-SOMs model is unique in terms of its layer architecture and its SOM sampling and learning methods. During the learning stage, we use a set of unsupervised SOMs for feature extraction. We validate the effectiveness of our approach on large data sets such as the Leukemia and SRBCT datasets. Comparative results show that the Deep-SOMs model outperforms many existing image classification algorithms.
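A single building block of the architecture above, a classical SOM, can be sketched as follows (a minimal 1-D-grid SOM trained on 2-D points, purely illustrative; grid size, learning-rate and neighborhood schedules are assumptions, and the Spark distribution and layer stacking are omitted):

```python
import math

def bmu(w, x):
    """Index of the best-matching unit (nearest weight vector)."""
    return min(range(len(w)),
               key=lambda i: (w[i][0] - x[0]) ** 2 + (w[i][1] - x[1]) ** 2)

def train_som(data, n_units=4, iters=200, lr0=0.5, sigma0=1.5):
    """Train a 1-D line of n_units SOM units on 2-D points."""
    # deterministic init: spread the units along the first data axis
    w = [[i / (n_units - 1), 0.5] for i in range(n_units)]
    for t in range(iters):
        frac = t / iters
        lr = lr0 * (1.0 - frac)            # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.1  # shrinking neighbourhood
        for x in data:
            b = bmu(w, x)
            for i, unit in enumerate(w):
                # Gaussian neighbourhood pulls nearby units toward the sample
                h = math.exp(-((i - b) ** 2) / (2 * sigma ** 2))
                unit[0] += lr * h * (x[0] - unit[0])
                unit[1] += lr * h * (x[1] - unit[1])
    return w
```

In the deep model, many such maps would run in parallel on local sub-regions, and their unit activations would feed the next layer as more abstract features.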
Automatic identification of television programs in a TV stream is an important task for operating archives. This article proposes a new spatio-temporal approach that identifies the programs in a TV stream in two main steps. First, a reference catalogue of video features describing the visual jingles is built. We exploit the features that characterize the instances of the same program type to identify the different types of programs in the television stream. The role of the video features is to represent the visual invariants of each jingle using automatic descriptors appropriate to each television program. Second, programs in the television stream are identified by examining the similarity of the video signal to the visual jingles in the catalogue. The main idea of the identification process is to compare the visual similarity of the video signal features in the television stream to the catalogue. After presenting the proposed approach, the paper reports encouraging experimental results on several streams extracted from different channels and composed of several programs.
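The catalogue matching step above can be sketched as a nearest-neighbor search over feature vectors (a minimal illustration; cosine similarity, the threshold value, and the catalogue layout are assumptions, not the paper's actual descriptors):

```python
def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def identify(signature, catalogue, threshold=0.9):
    """Return the catalogue label with the most similar jingle signature,
    or None when nothing passes the threshold (unknown segment)."""
    label, score = max(((name, cosine(signature, ref))
                        for name, ref in catalogue.items()),
                       key=lambda t: t[1])
    return label if score >= threshold else None
```

Rejecting low-scoring segments matters in practice, since most of a TV stream (advertising, inter-program content) matches no catalogued jingle.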
Face detection has been one of the most studied topics in the computer vision literature due to its central role in applications such as video surveillance, human-computer interfaces and face image database management. Here, we present a face detection approach consisting of two steps: a training phase based on the AdaBoost algorithm, followed by a detection phase. The proposed approach enhances Viola and Jones' algorithm by replacing the Haar descriptors with Beta wavelets. The results obtained show excellent detection performance, not only when a face is frontal to the camera but also when it is oriented to the right or the left. Moreover, thanks to the short time needed for detection, our approach can be applied in real time.
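The AdaBoost training phase above can be sketched with the standard discrete AdaBoost over threshold stumps (a generic illustration on 1-D feature responses, not the paper's Beta-wavelet features; the stump form and round count are assumptions):

```python
import math

def stump_candidates(xs):
    # thresholds halfway between consecutive sorted feature values
    s = sorted(set(xs))
    return [(a + b) / 2 for a, b in zip(s, s[1:])]

def train_adaboost(xs, ys, rounds=5):
    """Discrete AdaBoost over 1-D threshold stumps; labels ys in {-1, +1}."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best = None
        for thr in stump_candidates(xs):
            for pol in (1, -1):
                preds = [pol if x > thr else -pol for x in xs]
                err = sum(wi for wi, p, y in zip(w, preds, ys) if p != y)
                if best is None or err < best[0]:
                    best = (err, thr, pol, preds)
        err, thr, pol, preds = best
        err = max(err, 1e-12)                      # avoid division by zero
        alpha = 0.5 * math.log((1 - err) / err)    # weak-learner weight
        ensemble.append((alpha, thr, pol))
        # re-weight: boost the samples the stump got wrong
        w = [wi * math.exp(-alpha * y * p) for wi, y, p in zip(w, ys, preds)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    score = sum(alpha * (pol if x > thr else -pol)
                for alpha, thr, pol in ensemble)
    return 1 if score >= 0 else -1
```

In a Viola-Jones-style detector, each stump would act on one descriptor response (here, a Beta-wavelet response instead of a Haar one), and the weighted vote forms one stage of the cascade.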
This work fits into the context of automatic gesture interpretation based on computer vision. The aim of our work is to transform a conventional screen into a surface that allows the user to use his hands as pointing devices. The process can be summarized in three main steps: detecting the hands in a video, tracking the detected hands, and converting the paths traced by the hands into computer commands. To realize this application, the hand to be tracked must first be detected, and a classification phase is essential for the control part. For this reason, we resort to a neuro-fuzzy classifier for classification and a pattern matching method for detection.
Proc. SPIE. 9445, Seventh International Conference on Machine Vision (ICMV 2014)
KEYWORDS: Detection and tracking algorithms, Data modeling, Databases, Wavelets, Fast wavelet transforms, Speech recognition, Network architectures, Decision support systems, Fuzzy logic, Classification systems
This paper develops a novel approach to speech recognition based on a wavelet network learnt by the fast wavelet transform (FWN) and including a fuzzy decision support system (FDSS). Our contributions are twofold. First, we propose a novel learning algorithm for speech recognition based on the fast wavelet transform (FWT), which has many advantages over other algorithms and which solves major problems of previous works in computing the connection weights. Previously, the weights were determined by a direct solution requiring a matrix inversion, which may be computationally intensive; the new algorithm instead computes the connection weights by iterative application of the FWT. Second, we propose a new classification scheme for this speech recognition system: it emulates a human reasoning mode by employing a FDSS to compute similarity degrees between test and training signals. Extensive empirical experiments were conducted to compare the proposed approach with other approaches. The results obtained show that the new speech recognition system performs better than previously established ones.
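One level of a fast wavelet transform, the core operation iterated by the learning algorithm above, can be sketched with the Haar filter pair (a minimal illustration; the paper's wavelet family and network coupling are not specified here, so Haar is an assumption):

```python
import math

def haar_step(signal):
    """One FWT level: orthonormal Haar averages (approximation) + details."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(a + b) * s for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) * s for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail

def haar_inverse_step(approx, detail):
    """Exact inverse of haar_step (perfect reconstruction)."""
    s = 1.0 / math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * s)
        out.append((a - d) * s)
    return out
```

Recursing haar_step on the approximation coefficients yields the full multi-level decomposition; perfect reconstruction is what lets the weights be refined iteratively instead of via a one-shot matrix inversion.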
This work is in the field of human-computer communication, namely gestural communication. The objective was to develop a gesture recognition system that can be used to control a computer without a keyboard. The idea consists in using a visual panel printed on ordinary paper to communicate with the computer.