Down syndrome is one of the most common genetic disorders caused by chromosome abnormalities in humans. Among other physical characteristics, certain facial features are typically associated with Down syndrome. We investigate the problem of Down syndrome detection from a collection of face images. As the main contribution, a compact geometric descriptor is used to extract facial features from the images. Experiments are conducted on an available dataset to demonstrate the performance of the proposed methodology.
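A compact geometric descriptor of this kind is often built from distances between facial landmarks. The sketch below is illustrative only (the landmark set and normalization are assumptions, not the paper's exact descriptor): it turns a set of 2-D landmark points into a scale-invariant vector of normalized pairwise distances.

```python
import numpy as np

def geometric_descriptor(landmarks):
    """Compact geometric descriptor from 2-D facial landmarks.

    `landmarks` is an (n, 2) sequence of (x, y) points. The descriptor is
    the vector of all pairwise Euclidean distances, divided by the largest
    distance so that it is invariant to image scale.
    """
    pts = np.asarray(landmarks, dtype=float)
    n = len(pts)
    dists = np.array([np.linalg.norm(pts[i] - pts[j])
                      for i in range(n) for j in range(i + 1, n)])
    return dists / dists.max()

# Toy example: five hypothetical landmarks (eye corners, nose tip, mouth corners).
desc = geometric_descriptor([(30, 40), (70, 40), (50, 60), (35, 80), (65, 80)])
print(desc.shape)  # 5 landmarks -> 10 pairwise distances
```

Such a vector is compact (n landmarks yield only n(n-1)/2 features) and can be fed directly to a standard classifier.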
Remote sensing technology has applications in various knowledge domains, such as agriculture, meteorology, land use, environmental monitoring, military surveillance, and mineral exploration. Continuing advances in image acquisition techniques have enabled the generation of large volumes of data at high spectral resolution, with several spectral bands collected simultaneously for each image. We propose and evaluate a supervised classification method composed of three stages. Initially, hyperspectral values and entropy information are employed by support vector machines to produce an initial classification. Then, the K-nearest neighbor technique searches for pixels with a high probability of being correctly classified. Finally, minimum spanning forests are applied to these pixels to reclassify the image, taking spatial restrictions into consideration. Experiments on several hyperspectral images are conducted to show the effectiveness of the proposed method.
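The first two stages can be sketched as follows. This is a minimal illustration on synthetic stand-in data, not the paper's implementation: the confidence threshold and the KNN-agreement test are assumptions, and the minimum-spanning-forest stage is only indicated in a comment.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Synthetic stand-in for hyperspectral pixels: 200 samples, 10 bands, 2 classes.
X = rng.normal(size=(200, 10))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_train, y_train, X_test, y_test = X[:100], y[:100], X[100:], y[100:]

# Stage 1: pixel-wise SVM produces an initial classification with probabilities.
svm = SVC(probability=True, random_state=0).fit(X_train, y_train)
proba = svm.predict_proba(X_test)
initial = proba.argmax(axis=1)

# Stage 2: retain pixels likely to be correctly classified -- here, high SVM
# confidence plus agreement with a K-nearest-neighbor prediction.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
confident = (proba.max(axis=1) > 0.9) & (knn.predict(X_test) == initial)
markers = np.flatnonzero(confident)

# Stage 3 (not shown): the confident pixels would seed minimum spanning
# forests that relabel the remaining pixels under spatial constraints.
print(len(markers), "high-confidence pixels out of", len(X_test))
```

The design idea is that only reliably labeled pixels become seeds, so errors from the pixel-wise stage are not propagated by the spatial regularization.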
Video analysis technology has become less expensive and more powerful in terms of storage resources and resolution capacity, promoting progress in a wide range of applications. Video-based human action detection has been used for several tasks in surveillance environments, such as forensic investigation, patient monitoring, medical training, accident prevention, and traffic monitoring, among others. We present a method for action identification based on adaptive training of a multilayer descriptor applied to a single classifier. Cumulative motion shapes (CMSs) are extracted according to the number of frames present in the video. Each CMS is employed as a self-sufficient layer in the training stage but belongs to the same descriptor. A robust classification is achieved through individual responses of classifiers for each layer, and the dominant result is used as the final outcome. Experiments are conducted on five public datasets (Weizmann, KTH, MuHAVi, IXMAS, and URADL) to demonstrate the effectiveness of the method in terms of accuracy and real-time performance.
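A cumulative motion shape can be thought of as the accumulation of an actor's silhouette over a window of frames. The sketch below is an illustrative interpretation, not the paper's exact definition: it takes the union of binary silhouettes so the result traces the motion across the window.

```python
import numpy as np

def cumulative_motion_shape(frames):
    """Accumulate binary silhouettes over a window of frames into one shape.

    `frames` is a list of 2-D binary arrays (1 = foreground). The result is
    the union of the silhouettes, capturing the motion trace of the actor.
    (Illustrative sketch; the paper's CMS construction may differ.)
    """
    cms = np.zeros_like(frames[0], dtype=bool)
    for f in frames:
        cms = np.logical_or(cms, f)
    return cms.astype(np.uint8)

# Toy video: a one-pixel blob moving right across three 4x4 frames.
frames = [np.zeros((4, 4), dtype=np.uint8) for _ in range(3)]
for t, f in enumerate(frames):
    f[2, t] = 1
cms = cumulative_motion_shape(frames)
print(cms)  # row 2 holds the motion trace at columns 0-2
```

Computing one such shape per window, with the window length tied to the number of frames in the video, yields the multiple layers the abstract describes.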
Facial expressions are an important demonstration of human moods and emotions. Algorithms capable of recognizing facial expressions and associating them with emotions were developed and employed to compare the expressions that different cultural groups use to show their emotions. Static pictures of predominantly occidental and oriental subjects from public datasets were used to train machine learning algorithms, while local binary patterns, histograms of oriented gradients (HOGs), and Gabor filters were employed to describe the facial expressions for six different basic emotions. The most consistent combination, formed by the association of the HOG descriptor and support vector machines, was then used to classify the other cultural group: there was a strong drop in accuracy, meaning that the subtle differences in the facial expressions of each culture affected the classifier performance. Finally, a classifier was trained with images from both occidental and oriental subjects, and its accuracy was higher on multicultural data, evidencing the need for a multicultural training set to build an efficient classifier.
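The HOG-plus-SVM combination can be illustrated with a minimal sketch. To stay self-contained it uses a single global orientation histogram (real HOG bins gradients per local cell and normalizes per block) and synthetic edge images standing in for expression classes; all names and data here are illustrative.

```python
import numpy as np
from sklearn.svm import LinearSVC

def grad_orientation_histogram(img, bins=8):
    """Minimal HOG-style descriptor: one global histogram of gradient
    orientations weighted by gradient magnitude (real HOG uses local cells)."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.mod(np.arctan2(gy, gx), np.pi)  # unsigned orientations in [0, pi)
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    return hist / (hist.sum() + 1e-9)

rng = np.random.default_rng(1)

def make(kind):
    # Synthetic stand-in images: horizontal vs. vertical edge "classes".
    img = rng.normal(scale=0.1, size=(16, 16))
    if kind == 0:
        img[8:, :] += 1.0   # horizontal edge -> vertical gradients
    else:
        img[:, 8:] += 1.0   # vertical edge -> horizontal gradients
    return grad_orientation_histogram(img)

X = np.array([make(k) for k in range(2) for _ in range(20)])
y = np.array([k for k in range(2) for _ in range(20)])
clf = LinearSVC().fit(X, y)
print(clf.score(X, y))
```

The cross-cultural experiment in the abstract amounts to fitting `clf` on one group's descriptors and calling `clf.score` on the other group's.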
Robust local descriptors usually consist of high-dimensional feature vectors that describe distinctive characteristics of images. The high dimensionality of a feature vector incurs considerable costs in terms of computational time and storage. It also leads to the curse of dimensionality, which affects the performance of several tasks that use feature vectors, such as matching, retrieval, and classification of images. To address these problems, it is possible to employ dimensionality reduction techniques, which frequently lead to loss of information and, consequently, reduced accuracy. This work applies linear dimensionality reduction to the scale-invariant feature transform (SIFT) and speeded-up robust features (SURF) descriptors. The objective is to demonstrate that, even at the risk of decreasing the accuracy of the feature vectors, reduction yields a satisfactory trade-off between computational time and storage requirements. We perform linear dimensionality reduction through random projections, principal component analysis, linear discriminant analysis, and partial least squares in order to create lower-dimensional feature vectors. These reduced descriptors require less computational time and memory storage, and even improve accuracy in some cases. We evaluate the reduced feature vectors in a matching application, as well as their distinctiveness in image retrieval. Finally, we assess the computational time and storage requirements by comparing the original and the reduced feature vectors.
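As one concrete instance of this idea, the sketch below projects 128-D SIFT-like vectors down to 32 dimensions with PCA. The data is a synthetic stand-in (random vectors whose variance is concentrated in a few directions, as real descriptor sets tend to be); the target dimensionality of 32 is an illustrative choice, not the paper's.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
# Stand-in for SIFT descriptors: 500 vectors of 128 dimensions whose
# variance lies mostly in a 10-D subspace, plus a little noise.
basis = rng.normal(size=(10, 128))
descriptors = rng.normal(size=(500, 10)) @ basis + 0.01 * rng.normal(size=(500, 128))

pca = PCA(n_components=32).fit(descriptors)
reduced = pca.transform(descriptors)   # 128-D -> 32-D, a 4x storage saving
print(reduced.shape, round(pca.explained_variance_ratio_.sum(), 3))
```

Distances in the reduced space are then cheaper to compute for matching and retrieval, and the retained explained-variance ratio gives a first indication of how much descriptor information survives the projection.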
Successful execution of tasks such as image classification, object detection and recognition, and scene classification depends on the definition of a set of features able to describe images effectively. Texture is among the features used by the human visual system. It provides information regarding spatial distribution, changes in brightness, and the structural arrangement of surfaces. However, although the human visual system is extremely accurate in recognizing and describing textures, it is difficult to define a set of textural descriptors to be used in image analysis across different application domains. This work evaluates several texture descriptors and demonstrates that combining descriptors can improve the performance of texture classification.
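A common way to combine texture descriptors is simply to concatenate their feature vectors before classification. The sketch below is illustrative: the two toy descriptors (co-occurrence-style statistics and first-order intensity statistics) are simplified stand-ins for the descriptors a real evaluation would use.

```python
import numpy as np

def cooccurrence_stats(img):
    """Co-occurrence-style statistics (contrast, homogeneity) from
    horizontally adjacent pixel pairs; a stand-in for a full GLCM descriptor."""
    a, b = img[:, :-1].astype(float), img[:, 1:].astype(float)
    d = np.abs(a - b)
    return np.array([np.mean(d ** 2), np.mean(1.0 / (1.0 + d))])

def intensity_stats(img):
    """First-order statistics: mean, variance, and a smoothness measure."""
    img = img.astype(float)
    var = img.var()
    return np.array([img.mean(), var, 1 - 1 / (1 + var)])

def combined_descriptor(img):
    # Combination by concatenation: the classifier sees complementary cues
    # from both descriptors in a single feature vector.
    return np.concatenate([cooccurrence_stats(img), intensity_stats(img)])

img = np.indices((8, 8)).sum(axis=0) % 2 * 255  # checkerboard texture
print(combined_descriptor(img).shape)  # (5,)
```

Concatenation is the simplest fusion scheme; alternatives such as per-descriptor classifiers with score-level fusion follow the same pattern of exploiting complementary texture cues.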