Image Processing and Computer Vision solutions have become commodities for software developers, thanks to the growing availability of Application Programming Interfaces (APIs) that encapsulate rich functionality powered by advanced algorithms. Tech giants such as Apple, Google, IBM, and Microsoft have made APIs and microservices available in the cloud for the agile integration of machine learning and intelligent features into everyday applications. As privacy and cybersecurity become prime concerns, special efforts have been devoted to face processing and recognition. In this context, this paper presents a friendly, intuitive, and fun-to-use mobile app that leverages state-of-the-art APIs for face, age, gender, and emotion recognition. The Face-It-Up app was implemented for the iOS platform and uses the Microsoft Cognitive Services APIs as a tool for human vision and face processing research. Experiments on image compression, upside-down orientation, the Thatcher effect, negative inversion, high frequency, facial artifacts, caricatures, and image degradation were performed to test the application, using the Radboud and 10k US Adult Faces databases. The app benefits from access to high-resolution imagery and touch input on smart devices, allowing a wide range of new experiments from the user perspective. Furthermore, our approach serves as a potential framework for new initiatives in image-based biometrics, the Internet of Things, and citizen science.
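For context, a face-attribute detection call of the kind the app relies on looks roughly like the following Python sketch of the Face API's v1.0 REST interface. The endpoint region and subscription key are placeholders, and the app itself is an iOS client, so this is only an illustration of the service interface, not the app's actual code.

```python
import json
from urllib import request

# Placeholders: substitute your own Azure region endpoint and key.
FACE_API_ENDPOINT = "https://westus.api.cognitive.microsoft.com"
SUBSCRIPTION_KEY = "<your-subscription-key>"

def build_detect_request(image_url, attributes=("age", "gender", "emotion")):
    """Build (but do not send) an HTTP request to the Face API /detect
    endpoint, asking for the face attributes the app works with."""
    url = (f"{FACE_API_ENDPOINT}/face/v1.0/detect"
           f"?returnFaceAttributes={','.join(attributes)}")
    return request.Request(
        url,
        data=json.dumps({"url": image_url}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
        },
        method="POST",
    )
```

Sending the request (e.g., via `urllib.request.urlopen`) returns a JSON array with one entry per detected face, including the requested attributes.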
We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. Activity involving video data (e.g., storage, retrieval, and sharing) has increased significantly in the past decade, for both personal and professional use. The ever-growing amount of video content available for human consumption, together with the inherent characteristics of video data (which, presented in raw form, is rather unwieldy and costly to handle), has driven the development of more effective solutions for presenting video content and enabling rich user interaction. As a result, there are many contemporary research efforts toward better video browsing solutions, which we summarize. We review more than 40 video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we summarize existing work, highlight the technical aspects of each solution, and compare the solutions against each other.
This paper presents a novel algorithm for extracting regions of interest (ROIs) from images in an unsupervised way. It relies on the information provided by two computational models of bottom-up visual attention, encoded in the form of the image's salient points-of-attention (POAs) and areas-of-attention (AOAs). The proposed method combines these POAs and AOAs to generate binary masks that correspond to the ROIs within the image. First, each AOA is binarized through an adapted relaxation algorithm in which the histogram entropy of the AOA measurement serves as the stopping criterion of the iterative process. The AOAs are also smoothed with a Gaussian pyramid followed by interpolation. Next, the binary representation of the AOAs, the smoothed version of the AOAs, and the POAs are combined into a mask that covers the salient ROIs of the image. The proposed ROI extraction algorithm imposes no constraints on the number or distribution of salient regions in the input image. Qualitative and quantitative results show that the proposed method performs very well on a wide range of images, whether natural or man-made, from simple images of objects against a homogeneous background to complex cluttered scenes.
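The pipeline described above can be sketched roughly in Python/NumPy. This is an illustration under simplified assumptions, not the authors' implementation: the entropy-based stopping rule is a stand-in for the paper's adapted relaxation algorithm, plain separable Gaussian filtering replaces the Gaussian pyramid plus interpolation, and the final combination rule is a plausible guess at how the three cues might be merged.

```python
import numpy as np

def hist_entropy(values, bins=32):
    """Shannon entropy (bits) of the histogram of a set of values."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def binarize_aoa(aoa, eps=0.05, max_iter=50):
    """Relax a threshold on the AOA map from its maximum downward;
    stop when the histogram entropy of the retained values settles
    (assumed stand-in for the paper's relaxation criterion)."""
    t = aoa.max()
    prev = hist_entropy(aoa[aoa >= t])
    step = (aoa.max() - aoa.min()) / max_iter
    for _ in range(max_iter):
        t -= step
        cur = hist_entropy(aoa[aoa >= t])
        if abs(cur - prev) < eps:
            break
        prev = cur
    return (aoa >= t).astype(np.uint8)

def gaussian_smooth(img, sigma=2.0):
    """Separable Gaussian filtering (simplification of the paper's
    Gaussian pyramid followed by interpolation)."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 0, img)
    return np.apply_along_axis(lambda m: np.convolve(m, k, mode="same"), 1, out)

def roi_mask(aoa, poas, sigma=2.0):
    """Combine the binarized AOA, the smoothed AOA, and the POAs
    into one binary ROI mask. `poas` is a list of (row, col) points."""
    binary = binarize_aoa(aoa)
    smooth = gaussian_smooth(aoa, sigma)
    mask = binary & (smooth > smooth.mean()).astype(np.uint8)
    for r, c in poas:
        mask[r, c] = 1  # salient points are always kept in the ROI
    return mask
```

On a synthetic saliency map with a single bright square, `roi_mask` marks the square (and the supplied POA) while leaving the homogeneous background at zero, mirroring the simple-object case described in the abstract.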