We present a method for the automatic restoration of images subjected to the application of photographic filters, such as those made popular by photo-sharing services. The method uses a convolutional neural network (CNN) for the prediction of the coefficients of local polynomial transformations that are applied to the input image. The experiments we conducted on a subset of the Places-205 dataset show that the quality of the restoration performed by our method is clearly superior to that of traditional color balancing and restoration procedures, and to that of recent CNN architectures for image-to-image translation.
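As an illustration of the kind of transformation described above, the following minimal sketch applies a separate quadratic polynomial to each pixel of an RGB image. The degree-2 basis, the 3x10 coefficient layout, and the names `quadratic_basis`, `apply_local_polynomial`, and `IDENTITY` are assumptions made for this example only. In the method the coefficients would be predicted by the CNN, whereas here they are simply passed in.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]


def quadratic_basis(r: float, g: float, b: float) -> List[float]:
    """Degree-2 polynomial basis of an RGB triplet (10 terms, assumed layout)."""
    return [1.0, r, g, b, r * r, g * g, b * b, r * g, r * b, g * b]


def apply_local_polynomial(
    image: Sequence[Sequence[Pixel]],
    coeffs: Sequence[Sequence[Sequence[Sequence[float]]]],
) -> List[List[Pixel]]:
    """Apply a per-pixel polynomial color transformation.

    image[y][x] is an (r, g, b) triplet in [0, 1]; coeffs[y][x] is a 3x10
    matrix holding one row of polynomial weights per output channel.
    """
    out = []
    for row, coeff_row in zip(image, coeffs):
        out_row = []
        for (r, g, b), matrix in zip(row, coeff_row):
            basis = quadratic_basis(r, g, b)
            # One weighted sum per output channel, clamped to the valid range.
            channels = tuple(
                min(1.0, max(0.0, sum(a * t for a, t in zip(weights, basis))))
                for weights in matrix
            )
            out_row.append(channels)
        out.append(out_row)
    return out


# Coefficients that leave a pixel unchanged (useful as a sanity check).
IDENTITY = [
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
```

Because every pixel has its own coefficient matrix, the same code covers both spatially varying corrections and, when all matrices are equal, a single global one.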
The analysis of color and texture has a long history in image analysis and computer vision. These two properties are often considered as independent, even though they are strongly related in images of natural objects and materials. Correlation between color and texture information is especially relevant in the case of variable illumination, a condition that has a crucial impact on the effectiveness of most visual descriptors. We propose an ensemble of hand-crafted image descriptors designed to capture different aspects of color textures. We show that the use of these descriptors in a multiple-classifier framework makes it possible to achieve very high accuracy in classifying texture images acquired under different lighting conditions. A powerful alternative to hand-crafted descriptors is represented by features obtained with deep learning methods. We also show how, with the proposed combining strategy, hand-crafted and convolutional neural network features can be used together to further improve the classification accuracy. Experimental results on a food database (raw food texture) demonstrate the effectiveness of the proposed strategy.
In this paper we present a descriptor for texture classification based on the histogram of a local measure of the color contrast. The descriptor has been concatenated with several state-of-the-art color and intensity texture descriptors and evaluated on three datasets. Results show, in nearly every case, a performance improvement with respect to the baseline methods, thus demonstrating the effectiveness of the proposed texture features. The descriptor has also proved to be robust with respect to global changes in lighting conditions.
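The idea of a histogram of local color contrast can be sketched as follows. The abstract does not define the contrast measure, so this example assumes one: the largest Euclidean RGB distance between a pixel and its 4-connected neighbors. The function names and the number of bins are likewise hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]
Image = Sequence[Sequence[Pixel]]


def local_color_contrast(image: Image, y: int, x: int) -> float:
    """Assumed contrast measure: largest RGB distance to a 4-neighbor."""
    height, width = len(image), len(image[0])
    best = 0.0
    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ny, nx = y + dy, x + dx
        if 0 <= ny < height and 0 <= nx < width:
            best = max(best, math.dist(image[y][x], image[ny][nx]))
    return best


def contrast_histogram(image: Image, bins: int = 8) -> List[float]:
    """Normalized histogram of the local contrast over all pixels."""
    max_contrast = math.sqrt(3.0)  # largest distance for RGB values in [0, 1]
    counts = [0] * bins
    for y in range(len(image)):
        for x in range(len(image[0])):
            contrast = local_color_contrast(image, y, x)
            index = min(bins - 1, int(contrast / max_contrast * bins))
            counts[index] += 1
    total = sum(counts)
    return [n / total for n in counts]
```

The resulting list can be concatenated with any other descriptor vector, which mirrors how the descriptor is combined with existing ones in the experiments.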
We present here the results obtained by including a new image descriptor, which we call the prosemantic feature
vector, within the framework of the QuickLook2 image retrieval system. By coupling the prosemantic features and
the relevance feedback mechanism provided by QuickLook2, the user can move in a more rapid and precise way
through the feature space toward the intended goal. The prosemantic features are obtained by a two-step feature
extraction process. At the first step, low level features related to image structure and color distribution are
extracted from the images. At the second step, these features are used as input to a bank of classifiers, each
one trained to recognize a given semantic category, to produce score vectors. We evaluated the efficacy of the
prosemantic features under search tasks on a dataset provided by Fratelli Alinari Photo Archive.
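The two-step extraction described above can be sketched as follows. This is only an illustration under stated assumptions: the toy low-level features (channel means and a mean horizontal luminance gradient) and the use of linear scorers as the bank of classifiers are choices made for this example, not details given in the abstract, and all names are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]
LinearClassifier = Tuple[Sequence[float], float]  # (weights, bias), assumed form


def low_level_features(image: Sequence[Sequence[Pixel]]) -> List[float]:
    """Step 1: toy color and structure features (assumed, for illustration)."""
    pixels = [p for row in image for p in row]
    means = [sum(p[c] for p in pixels) / len(pixels) for c in range(3)]
    luminance = [[sum(p) / 3.0 for p in row] for row in image]
    diffs = [abs(row[i + 1] - row[i]) for row in luminance for i in range(len(row) - 1)]
    edge_strength = sum(diffs) / len(diffs) if diffs else 0.0
    return means + [edge_strength]


def prosemantic_vector(
    image: Sequence[Sequence[Pixel]],
    classifiers: Sequence[LinearClassifier],
) -> List[float]:
    """Step 2: feed the low-level features to every classifier in the bank.

    Each classifier stands for one semantic category; the vector of its
    scores is the descriptor used for retrieval.
    """
    features = low_level_features(image)
    return [
        sum(w * f for w, f in zip(weights, features)) + bias
        for weights, bias in classifiers
    ]
```

The output has one component per semantic category, so its length depends only on the size of the classifier bank and not on the number of low-level features.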
In this work we present an automatic local color transfer method based on semantic image annotation. With
this annotation, images are segmented into homogeneous regions, each assigned to one of seven classes (sky,
vegetation, snow, water, ground, street, and sand). Our method automatically transfers the color distribution
between regions of the source and target images annotated with the same class (for example the class "sky"). The
amount of color transfer can be controlled by tuning a single parameter. Experimental results show that
our local color transfer is usually more visually pleasing than a global approach.
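A class-wise transfer with a single strength parameter could look like the sketch below. It assumes a simple statistics-matching transfer (per-channel mean and standard deviation), which is a stand-in chosen for this example and not necessarily the transfer used in the work. The parameter name `alpha` and the dictionary layout (class label mapped to a list of pixels) are also hypothetical.

```python
from statistics import fmean, pstdev
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[float, float, float]


def transfer_channel(values: Sequence[float], reference: Sequence[float], alpha: float) -> List[float]:
    """Match mean and standard deviation to the reference, blended by alpha."""
    mu_v, mu_r = fmean(values), fmean(reference)
    sd_v, sd_r = pstdev(values), pstdev(reference)
    scale = sd_r / sd_v if sd_v > 0 else 1.0
    return [(1 - alpha) * v + alpha * ((v - mu_v) * scale + mu_r) for v in values]


def transfer_regions(
    image_regions: Dict[str, List[Pixel]],
    reference_regions: Dict[str, List[Pixel]],
    alpha: float,
) -> Dict[str, List[Pixel]]:
    """Recolor each region using the reference region with the same class label.

    alpha = 0 leaves the image unchanged and alpha = 1 applies the full
    transfer. Regions without a matching class in the reference are kept as-is.
    """
    result = {}
    for label, pixels in image_regions.items():
        reference = reference_regions.get(label)
        if not pixels or not reference:
            result[label] = list(pixels)
            continue
        channels = [
            transfer_channel([p[c] for p in pixels], [q[c] for q in reference], alpha)
            for c in range(3)
        ]
        result[label] = list(zip(*channels))
    return result
```

Values are not clamped here, so a caller working with a fixed range such as [0, 1] would need to clip the result.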
In this work we present a system which visualizes the results obtained from image search engines in such a way
that users can conveniently browse the retrieved images. The way in which search results are presented allows
the user to grasp the composition of the set of images "at a glance". To do so, images are grouped and positioned
according to their distribution in a prosemantic feature space which encodes information about their content at
an abstraction level that can be placed between visual and semantic information. The compactness of the feature
space allows a fast analysis of the image distribution so that all the computation can be performed in real time.
We propose here a strategy for the automatic annotation of outdoor photographs. Images are segmented into
homogeneous regions which are then assigned to seven different classes: sky, vegetation, snow, water, ground,
street, and sand. These categories allow for content-aware image processing strategies. Our annotation strategy
uses a normalized cut segmentation to identify the regions to be classified by a multi-class Support Vector
Machine. The strategy has been evaluated on a set of images taken from the LabelMe dataset.
We propose a method for the semi-automatic organization of photo albums. The method analyzes how different
users organize their own pictures. The goal is to help the user in dividing his pictures into groups characterized
by a similar semantic content. The method is semi-automatic: the user starts to assign labels to the pictures
and unlabeled pictures are tagged with proposed labels. The user can accept the recommendation or make a
correction. We use a suitable feature representation of the images to model the different classes that the users
have collected. Then, we look for correspondences between the criteria used by the different users which are
integrated using boosting. A quantitative evaluation of the proposed approach is obtained by simulating the
amount of user interaction needed to annotate the albums of a set of members of the Flickr® photo-sharing
Correct image orientation is often assumed by common imaging applications such as enhancement, browsing, and
retrieval. However, the information provided by camera metadata is often missing or incorrect. In these cases
manual correction is required, otherwise the images cannot be correctly processed and displayed. In this work
we propose a system which automatically detects the correct orientation of digital photographs. The system
exploits the information provided by a face detector and a set of low-level features related to the distributions
of color and edges in the image. To prove the effectiveness of the proposed approach we evaluated it on two datasets
of consumer photographs.
No-reference quality metrics estimate the perceived quality using only the image itself. Typically, no-reference
metrics are designed to measure specific artifacts using a distortion model. Some psycho-visual experiments
have shown that the perception of distortions is influenced by the amount of detail in the image's content,
suggesting the need for a "content weighting factor." This dependency is coherent with known masking effects
of the human visual system. In order to explore this phenomenon, we set up a series of experiments applying
regression trees to the problem of no-reference quality assessment. In particular, we have focused on the blocking
distortion of JPEG compressed images. Experimental results show that information about the visual content of
the image can be exploited to improve the estimation of the quality of JPEG compressed images.
Although traditional content-based retrieval systems have been successfully employed in many multimedia applications, the need for explicit association of higher-level concepts to images has been a pressing demand from users. Many research works have focused on the reduction of the semantic gap between visual features and the semantics of the image content. In this paper we present a mechanism that combines broad high-level concepts and low-level visual features within the framework of the QuickLook content-based image retrieval system. This system also implements a relevance feedback algorithm to learn users' intended query from positive and negative image examples. With the relevance feedback mechanism, the retrieval process can be efficiently guided toward the semantic or pictorial contents of the images by providing the system with suitable examples. The qualitative experiments performed on a database of more than 46,000 photos downloaded from the Web show that the combination of semantic and low-level features, coupled with a relevance feedback algorithm, effectively improves the accuracy of the image retrieval sessions.
This paper presents FaceLab, an innovative, open environment created to evaluate the performance of face recognition strategies. It simplifies, through an easy-to-use graphical interface, the basic steps involved in testing procedures such as data organization and preprocessing, definition and management of training and test sets, definition and execution of recognition strategies and automatic computation of performance measures. The user can extend the environment to include new algorithms, allowing the definition of innovative recognition strategies. The performance of these strategies can be automatically evaluated and compared by the tool, which computes several performance measures for both identity verification and identification scenarios.
This paper presents an innovative method that combines a feature-based approach with a holistic approach for three-dimensional face detection and localization. Salient face features, such as the eyes and nose, are detected through an analysis of the curvature of the surface. In a second stage, each triplet consisting of a candidate nose and two candidate eyes is processed by a PCA-based classifier, trained to discriminate between faces and non-faces. The method has been tested on about 150 3D faces acquired by a laser range scanner with good results.
The paper describes an innovative image annotation tool for classifying image regions into one of seven classes (sky, skin, vegetation, snow, water, ground, and buildings) or as unknown. This tool could be productively applied in the management of large image and video databases where a considerable volume of images/frames must be automatically indexed. The annotation is performed by a classification system based on a multi-class Support Vector Machine. Experimental results on a test set of 200 images are reported and discussed.
The paper addresses the problem of distinguishing between pornographic and non-pornographic photographs, for the design of semantic filters for the web. Both decision forests of trees built according to the CART (Classification And Regression Trees) methodology and Support Vector Machines (SVMs) have been used to perform the classification. The photographs are described by a set of low-level features that can be computed automatically from the gray-level and color representations of the image. The database used in our experiments contained 1500 photographs, 750 of which were labeled as pornographic on the basis of the independent judgement of several viewers.
The paper addresses the problem of annotating photographs with broad semantic labels. To cope with the great variety of photos available on the Web we have designed a hierarchical classification strategy which first classifies images as pornographic or non-pornographic. Non-pornographic images are then classified as indoor, outdoor, or close-up. On a database of over 9000 images, mostly downloaded from the web, our method achieves an average accuracy of close to 90%.
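The two-level decision described above can be sketched as a small cascade. The classifiers themselves are not specified here, so the example takes them as arbitrary callables. The function name `classify_hierarchically` and its parameters are hypothetical, and the feature vector is treated as an opaque sequence of numbers.

```python
from typing import Callable, Sequence

Features = Sequence[float]
SCENE_LABELS = ("indoor", "outdoor", "close-up")


def classify_hierarchically(
    features: Features,
    is_pornographic: Callable[[Features], bool],
    scene_classifier: Callable[[Features], str],
) -> str:
    """Run the first-level filter, then the scene classifier if needed."""
    # Level 1: images flagged here never reach the scene classifier.
    if is_pornographic(features):
        return "pornographic"
    # Level 2: the remaining images get one of the three scene labels.
    label = scene_classifier(features)
    if label not in SCENE_LABELS:
        raise ValueError(f"unexpected scene label: {label!r}")
    return label
```

For instance, with stub classifiers, `classify_hierarchically([0.1], lambda f: False, lambda f: "outdoor")` returns `"outdoor"`. Keeping the two levels as separate callables means either stage can be replaced without touching the other.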