Faces often appear very small and in non-frontal orientations in surveillance videos because of the need for wide fields of view and the typically large distance between the cameras and the scene. Both low resolution and side-view poses make tasks such as face recognition difficult. As a result, face hallucination, or super-resolution of face images, is generally needed and has become a thriving research field. However, most existing methods assume that face images have been well aligned into some canonical form (i.e., frontal and symmetric). Therefore, face alignment, especially for low-resolution face images, is a key first step to the success of many face applications. In this paper, we propose an automatic alignment approach for face images at different resolutions, which consists of two fundamental steps: 1) finding the locations of facial landmarks or feature points (e.g., eyes and nose), even for very low-resolution faces; and 2) estimating and correcting head poses based on the landmark locations and a 3D reference face model. The effectiveness of this method is shown by the aligned face images and the improved face recognition scores on released data sets.
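As a small illustration of how landmark locations can drive alignment, the sketch below levels a face using only the two eye landmarks. It corrects in-plane rotation (roll) only and does not use a 3D reference face model, so it is a simplification rather than the method proposed in the abstract. The function names and the landmark dictionary keys (`left_eye`, `right_eye`) are assumptions made for this example.

```python
import math


def roll_angle_deg(left_eye, right_eye):
    """Angle (degrees) of the line joining the eyes, measured from the x-axis."""
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    return math.degrees(math.atan2(dy, dx))


def rotate_point(point, center, angle_deg):
    """Rotate a 2D point about a center by angle_deg."""
    a = math.radians(angle_deg)
    x, y = point[0] - center[0], point[1] - center[1]
    return (center[0] + x * math.cos(a) - y * math.sin(a),
            center[1] + x * math.sin(a) + y * math.cos(a))


def level_eyes(landmarks):
    """Rotate all landmarks about the eye midpoint so the eyes share one y value.

    `landmarks` is a hypothetical dict of name -> (x, y) that contains at
    least the keys "left_eye" and "right_eye".
    """
    left, right = landmarks["left_eye"], landmarks["right_eye"]
    angle = roll_angle_deg(left, right)
    center = ((left[0] + right[0]) / 2.0, (left[1] + right[1]) / 2.0)
    return {name: rotate_point(p, center, -angle) for name, p in landmarks.items()}
```

The same rotation would be applied to the image pixels in a full pipeline; correcting yaw and pitch requires the 3D model step that this sketch leaves out.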
In the design of a magazine cover, making a set of decisions regarding the color distribution of the cover image and the colors of other graphical and textual elements is considered to be the concept of color design. This concept addresses a number of subjective challenges, specifically how to determine a set of colors that is aesthetically pleasing yet also contributes to the functionality of the design, the legibility of textual elements, and the stylistic consistency of the class of magazine. Our solution to automatic color design includes the quantification of these challenges by deploying a number of well-known color theories. These color theories span both color harmony and color semantics. The former includes a set of geometric structures that suggest which colors are in harmony together. The latter suggests a higher level of abstraction: color semantics aims to bridge sets of color combinations with color mood descriptors. For automatic design, we aim to deploy these two viewpoints by applying geometric structures for the design of text color and color semantics for the selection of cover images.
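The geometric structures of color harmony are commonly described as fixed angular relations on a hue wheel. The sketch below shows that idea under stated assumptions: the three schemes and their hue offsets are textbook examples chosen for illustration, not the specific structures used in the paper, and the names `HARMONY_OFFSETS`, `harmonic_hues`, and `hue_to_rgb` are hypothetical.

```python
import colorsys

# Assumed harmony templates: hue offsets in degrees around the color wheel.
HARMONY_OFFSETS = {
    "complementary": [0, 180],
    "triadic": [0, 120, 240],
    "analogous": [-30, 0, 30],
}


def harmonic_hues(base_hue_deg, scheme):
    """Return the hues (degrees in [0, 360)) that the scheme pairs with a base hue."""
    return [(base_hue_deg + offset) % 360 for offset in HARMONY_OFFSETS[scheme]]


def hue_to_rgb(hue_deg, saturation=1.0, value=1.0):
    """Convert an HSV color to an 8-bit RGB triple."""
    r, g, b = colorsys.hsv_to_rgb(hue_deg / 360.0, saturation, value)
    return tuple(round(c * 255) for c in (r, g, b))


# Example: candidate text hues for a cover image whose dominant hue is 30 degrees.
candidate_hues = harmonic_hues(30, "complementary")
```

A text-color selector could then score each candidate hue for legibility against the local background before choosing one.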
Consumer photos are typically authored once, but need to be retargeted for reuse in various situations. These
include printing a photo on different size paper, changing the size and aspect ratio of an embedded photo to
accommodate the dynamic content layout of web pages or documents, adapting a large photo for browsing on
small displays such as mobile phone screens, and improving the aesthetic quality of a photo that was badly
composed at the capture time. In this paper, we propose a novel, effective, and comprehensive content-aware
automatic cropping (hereafter referred to as “autocrop”) method for consumer photos to achieve the above
purposes. Our autocrop method combines the state-of-the-art context-aware saliency detection algorithm, which
aims to infer the likely intent of the photographer, and the “branch-and-bound” efficient subwindow search
optimization technique, which seeks to locate the globally optimal cropping rectangle in a fast manner. Unlike
most current autocrop methods, which can only crop a photo into an arbitrary rectangle, our autocrop method
can automatically crop a photo into either a rectangle of arbitrary dimensions or a rectangle of the desired aspect
ratio specified by the user. The aggressiveness of the cropping operation may be either automatically determined
by the method or manually indicated by the user with ease. In addition, our autocrop method is extended to
support the cropping of a photo into non-rectangular shapes such as polygons of any number of sides. It may also
be potentially extended to return multiple cropping suggestions, which will enable the creation of new photos to
enrich the original photo collections. Our experimental results show that the proposed autocrop method in this
paper can generate high-quality crops for consumer photos of various types.
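As a rough illustration of the subwindow search step, the sketch below scores each pixel as its saliency minus a threshold and returns the rectangle with the largest total score. It uses an exhaustive Kadane-style scan rather than the branch-and-bound search named in the abstract, and the function name `best_crop`, the `threshold` parameter, and the list-of-lists saliency map are assumptions made for this example.

```python
def best_crop(saliency, threshold=0.5):
    """Find the axis-aligned rectangle maximizing sum(saliency - threshold).

    `saliency` is a list of rows with values in [0, 1]. Returns
    ((top, left, bottom, right), score) with inclusive coordinates.
    """
    height, width = len(saliency), len(saliency[0])
    best_score, best_rect = float("-inf"), None
    for top in range(height):
        # Column sums of the score over rows top..bottom.
        col_sums = [0.0] * width
        for bottom in range(top, height):
            for x in range(width):
                col_sums[x] += saliency[bottom][x] - threshold
            # 1-D maximum-sum interval over the column sums (Kadane's scan).
            run, start = 0.0, 0
            for x in range(width):
                if run <= 0:
                    run, start = col_sums[x], x
                else:
                    run += col_sums[x]
                if run > best_score:
                    best_score, best_rect = run, (top, start, bottom, x)
    return best_rect, best_score
```

Raising `threshold` penalizes weakly salient pixels more heavily and so yields tighter crops, which is one simple way to expose a cropping-aggressiveness control.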
In this paper, we present improvements to image selection and image layout for automatic photobook generation algorithms. These improvements are designed to help the user easily create a photo album that matches the user's preferences and strengthens the aesthetic quality of the photobook. Image content, composition, and metadata are used to determine the set of images selected and to suggest the layout of each page.
Automated publishing requires large databases containing document page layout templates. The number of
layout templates that need to be created and stored grows exponentially with the complexity of the document
layouts. A better approach for automated publishing is to reuse layout templates of existing documents for
the generation of new documents. In this paper, we present an algorithm for template extraction from a document
page image. We use the cost-optimized segmentation algorithm (COS) to segment the image, and Voronoi
decomposition to cluster the text regions. Then, we create a block image where each block represents a homogeneous
region of the document page. We construct a geometrical tree that describes the hierarchical structure
of the document page. We also implement a font recognition algorithm to analyze the font of each text region.
We present a detailed description of the algorithm and our preliminary results.
We present HP SmartPrint, a novel web browser plug-in which automatically suggests print-worthy content within a web
page and provides an intuitive UI for users to make corrections to the initial suggestion, if needed. The resulting prints
contain only user-desired content and exclude noise such as ads, thus increasing the desirability of the prints while
minimizing the cost. This solution provides a streamlined web printing experience and will be shipping with most HP
printers starting in 2011.
Automatically quantifying the aesthetic appeal of images is an interesting problem in computer science and image
processing. In this paper, we incorporate aesthetic properties and convert them into computable image features for
classifying photographs taken by amateur and professional photographers. In particular, color histograms, spatial edge
distribution, and repetition identification are used as features. Results of experiments on professional and amateur
photograph data sets confirm the discriminative power of these features.
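Of the features listed, the color histogram is the simplest to make concrete. The sketch below computes a coarse, normalized joint RGB histogram that could serve as one feature vector for a classifier. The bin count, the function name `color_histogram`, and the flat list-of-triples pixel format are assumptions for illustration. The abstract does not specify how its histograms are built.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram.

    `pixels` is a list of (r, g, b) triples with 8-bit values. The result has
    bins_per_channel ** 3 entries that sum to 1 (or all zeros if empty).
    """
    step = 256 / bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each channel, clamping to the last bin for safety.
        ri = min(int(r / step), bins_per_channel - 1)
        gi = min(int(g / step), bins_per_channel - 1)
        bi = min(int(b / step), bins_per_channel - 1)
        hist[(ri * bins_per_channel + gi) * bins_per_channel + bi] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist] if total else hist
```

Normalizing by the pixel count makes histograms from images of different sizes directly comparable.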
In this paper, we propose a system for automatic design of magazine covers that quantifies a number of concepts from
art and aesthetics. Our solution to automatic design of this type of media has been shaped by input from professional
designers, magazine art directors and editorial boards, and journalists. Consequently, a number of principles in design
and rules in designing magazine covers are delineated. Several techniques are derived and employed in order to quantify
and implement these principles and rules in the format of a software framework. At this stage, our framework divides the
task of design into three main modules: layout of magazine cover elements, choice of color for masthead and cover lines,
and typography of cover lines. Feedback from professional designers on our designs suggests that our results are
congruent with their intuition.
We present hp2.me, a URL shortener service for improving the mobile web consumption experience. Unlike
other such services, given a short URL, hp2.me returns an image rendered from the salient regions of a web
page. This approach to displaying web content improves mobile web reading experience through reduced
latency and improved clarity. It is faster to load a few large image files over a cellular data network than
many small files, and limited mobile screen real estate can be better utilized to display relevant content. The
hp2.me service is currently in external beta, and we present results to illustrate the advantages of this
Even though technology has allowed us to measure many different aspects of images, it is still a challenge to
objectively measure their aesthetic appeal. A more complex challenge is presented when an arrangement of
images is to be analyzed, such as in a photo-book page. Several approaches have been proposed to measure the
appeal of a document layout that, in general, make use of geometric features such as the position and size of a
single object relative to the overall layout. Fewer efforts have been made to include in a metric the influence of
the content and composition of images in the layout. Many of the aesthetic characteristics that graphic designers
and artists use in their daily work have been either left out of the analysis or only roughly approximated in an
effort to materialize the concepts.
Moreover, graphic design tools such as transparency and layering play an important role in the professional
creation of layouts for documents such as posters and flyers. The main goal of our study is to apply similar
techniques within an automated photo-layout generation tool. Among other design techniques, the tool makes
use of layering and transparency in the layout to produce a professional-looking arrangement of the pictures.
Two series of experiments involving people with different levels of graphic design expertise provided us with the
tools to make the results of our system more appealing. In this paper, we discuss the results of our experiments
in the context of distinct graphic design concepts.
3D head models have many applications, such as virtual conferencing and 3D web games. Several existing web-based
face modeling solutions can create a 3D face model from one or two user-uploaded face images, but they are limited to
generating a 3D model of the face region only. The accuracy of such reconstructions is very limited for side views, as well
as for hair regions. The goal of our research is to develop a framework for reconstructing a realistic 3D human head based
on two approximately orthogonal views. Our framework takes two images and goes through segmentation, feature point
detection, 3D bald head reconstruction, 3D hair reconstruction, and texture mapping to create a 3D head model. The main
contribution of the paper is that the processing steps are applied to both the face region and the hair region.
We describe a cloud-based automated-publishing platform that allows third party developers to embed our software
components into their applications, enabling their users to rapidly create documents for interactive viewing, or
fulfillment via mail or retail printing. We also describe how applications built on this platform can integrate with a
variety of different consumer digital ecosystems, and how we will address the quality and scaling challenges.
Managing large document databases is an important task today. Being able to automatically compare
document layouts and classify and search documents with respect to their visual appearance
proves to be desirable in many applications. We measure single page documents' similarity with
respect to distance functions between three document components: background, text, and saliency.
Each document component is represented as a Gaussian mixture distribution, and distances between
different documents' components are calculated as probabilistic similarities between corresponding
distributions. The similarity measure between documents is represented as a weighted sum of the
components' distances. Using this document similarity measure, we propose a browsing mechanism
operating on a document dataset. For these purposes, we use a hierarchical browsing environment
which we call the document similarity pyramid. It allows the user to browse a large document dataset
and to search for documents in the dataset that are similar to the query. The user can browse the
dataset on different levels of the pyramid, and zoom into the documents that are of interest.
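The weighted-sum similarity measure can be sketched as follows, assuming the per-component distances (background, text, saliency) between Gaussian mixture distributions have already been computed by some other routine. The function names, the weight values, and the document ids in the usage lines are hypothetical.

```python
def document_distance(component_distances, weights):
    """Weighted sum of per-component distances between two documents."""
    return sum(weights[name] * component_distances[name] for name in weights)


def rank_by_similarity(query_distances, weights):
    """Order candidate document ids from most to least similar to a query.

    `query_distances` maps each candidate id to its dict of component
    distances from the query document.
    """
    scored = {doc_id: document_distance(d, weights)
              for doc_id, d in query_distances.items()}
    return sorted(scored, key=scored.get)


# Usage with made-up numbers: three components, weights summing to 1.
weights = {"background": 0.2, "text": 0.5, "saliency": 0.3}
query_distances = {
    "doc_a": {"background": 0.1, "text": 0.9, "saliency": 0.5},
    "doc_b": {"background": 0.4, "text": 0.1, "saliency": 0.2},
}
ranking = rank_by_similarity(query_distances, weights)
```

Changing the weights shifts the ranking toward layout, text, or saliency similarity without recomputing the component distances.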
Businesses have traditionally relied on different types of media to communicate with existing and potential customers.
With the emergence of the Web, the relation between the use of print and electronic media has continually evolved. In
this paper, we investigate one possible scenario that combines the use of the Web and print. Specifically, we consider the
scenario where a small- or medium-sized business (SMB) has an existing web site from which they wish to pull content
to create a print piece. Our assumption is that the web site was developed by a professional designer, working in
conjunction with the business owner or marketing team, and that it contains a rich assembly of content that is presented
in an aesthetically pleasing manner. Our goal is to understand the process that a designer would follow to create an
effective and aesthetically pleasing print piece. We are particularly interested to understand the choices made by the
designer with respect to placement and size of the text and graphic elements on the page. Toward this end, we conducted
an experiment in which professional designers worked with SMBs to create print pieces from their respective web pages.
In this paper, we report our findings from this experiment, and examine the underlying conclusions regarding the
resulting document aesthetics in the context of the existing design, engineering, and computer science literature that
addresses this topic.
In this paper we propose a simple method to obtain a Cartesian color dither-screen from a given monochrome dither-screen. The monochrome dot placement pattern (e.g. cluster or scatter), as well as its frequency domain features are maintained, while optimizing for color quality. Color quality is measured against the Minimal Brightness Variation Criterion.
In this paper we introduce a class of linear filters called 'donut filters' for the design of halftone screens that enable robust printing with stochastic clustered dots. The donut filter approach is a simple, yet efficient method to produce pleasing stochastic clustered-dot halftone patterns (a.k.a. AM-FM halftones) suitable for systems with poor isolated dot reproduction and/or significant dot-gain. The radial profile of a donut filter resembles the radial cross section of a donut shape, with low impulse response at the center that rises to a peak and drops off rapidly as the pixel distance from the center is increased. A simple extension for the joint design of any number of colorant screens is given. This extension makes use of several optimal linear filters that may be treated as a single donut multi-filter having matrix-valued coefficients. A key contribution of this paper is the design of the parametric donut filters to be used at each graylevel. We show that given a desired spatial pair-correlation profile (a.k.a. spatial halftone statistics), optimum donut filters may be generated, such that the donut filter based screen design produces patterns possessing the desired profile in the maximum-likelihood sense. In fact, 'optimal green-noise' halftone screens having the spatial statistics described by Lau, Arce and Gallagher may be produced as a special case of our design. We will also demonstrate donut filter designs that do not use an 'optimum green-noise' target profile in the design and yet produce excellent stochastic clustered-dot halftone screens.
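The radial profile described above (low at the center, rising to a peak, then falling off) can be illustrated with a difference of two Gaussians. This parametric form is an assumption chosen for the sketch. The abstract does not state the filter's actual parameterization, and the names `donut_kernel`, `sigma_outer`, and `sigma_inner` are hypothetical.

```python
import math


def donut_kernel(size, sigma_outer=3.0, sigma_inner=1.5):
    """Build a size x size kernel with a donut-shaped radial profile.

    Assumed form: exp(-r^2 / (2 * sigma_outer^2)) - exp(-r^2 / (2 * sigma_inner^2)),
    which is zero at the center and peaks on a ring around it.
    `size` should be odd so the kernel has a single center pixel.
    """
    center = size // 2
    kernel = []
    for y in range(size):
        row = []
        for x in range(size):
            r2 = (x - center) ** 2 + (y - center) ** 2
            wide = math.exp(-r2 / (2.0 * sigma_outer ** 2))
            narrow = math.exp(-r2 / (2.0 * sigma_inner ** 2))
            row.append(wide - narrow)
        kernel.append(row)
    return kernel
```

With the default parameters the response peaks roughly three pixels from the center, so the two sigma values jointly control the ring radius and therefore the preferred spacing between dot clusters.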
When halftoning a color image for a bi-level color printer, one has to obtain the halftones of the cyan, magenta, and yellow planes if the printer is a three-color device. For a four-color printer, one has to also obtain the halftone of the black plane. Suppose a source color image is represented by the red, green, and blue components. The simple way of halftoning a color image using a dither matrix is to halftone each color plane independently using the same matrix. This will result in halftone dots of different colors overlapping each other, thus increasing the graininess of an image. Simple schemes such as shifting the matrices have been proposed in the past, but they usually reduce the dot overlap at the cost of increasing other artifacts, such as fuzziness. We propose an algorithm to jointly design a set of dither matrices such that the overall graininess is minimized. We use the direct binary search (DBS) algorithm to design a dither matrix for each of the primary colors of a printer: cyan, magenta, yellow, and black. A color fluctuation function is defined for the halftone patterns of a set of constant tone color patches in a uniform color space such as CIE L*a*b*. The color fluctuation function is then minimized on a level-by-level basis using swap operations. Efficient evaluation of the color fluctuation function allows the optimization to converge at a reasonable speed. We show that we are able to achieve halftone image quality comparable to that of the direct binary search (DBS) algorithm at a significantly lower computational cost. Because the dither matrices are pre-computed, efficient implementation in either hardware or software is possible.
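The baseline that the abstract argues against, halftoning every color plane independently with the same dither matrix, can be sketched in a few lines. The sketch assumes a standard 4x4 Bayer matrix, a naive RGB-to-CMY conversion (colorant = 255 minus channel), and no black plane. None of these choices come from the paper, and the function names are hypothetical.

```python
# Standard 4x4 Bayer (dispersed-dot) dither matrix, used here as an example.
BAYER_4X4 = [
    [0, 8, 2, 10],
    [12, 4, 14, 6],
    [3, 11, 1, 9],
    [15, 7, 13, 5],
]


def threshold_at(x, y):
    """Map the tiled matrix entry at (x, y) to a threshold strictly inside (0, 255)."""
    return (BAYER_4X4[y % 4][x % 4] + 0.5) * 255.0 / 16.0


def halftone_plane(plane):
    """Return a 0/1 dot pattern: 1 where the colorant amount exceeds the threshold."""
    return [[1 if value > threshold_at(x, y) else 0
             for x, value in enumerate(row)]
            for y, row in enumerate(plane)]


def halftone_rgb(image):
    """Halftone the C, M, and Y planes independently with the same matrix.

    `image` is a list of rows of (r, g, b) triples with 8-bit values.
    """
    planes = {}
    for index, name in enumerate(("cyan", "magenta", "yellow")):
        colorant = [[255 - pixel[index] for pixel in row] for row in image]
        planes[name] = halftone_plane(colorant)
    return planes
```

For a neutral gray patch the three planes come out identical, so every printed location carries all three colorants on top of each other. That dot-on-dot overlap is what a jointly designed set of matrices tries to avoid.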
The perceived quality of a printed image depends on the halftone algorithm and the printing process. This paper proposes a new method of analyzing halftone image quality in the frequency domain based on a human vision model. First, the Fourier transform characteristics of a dithered image are reviewed. Several commonly used dither algorithms, including clustered-dot dither and dispersed-dot dither, are evaluated based on their Fourier transform characteristics. Next, images halftoned with the dither algorithms and the Floyd-Steinberg error diffusion algorithm are compared in the frequency domain. Factors affecting printed image quality in a printing process are also discussed. Finally, a perception-based halftone image distortion measure is proposed. This measure reflects the quality of a halftone image printed on an ideal bi-level device and viewed at a particular distance. The halftone algorithms are ranked according to the proposed distortion measure. The effects of using human visual models with different peak sensitivity frequencies are examined.
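For reference, the Floyd-Steinberg error diffusion algorithm mentioned above can be sketched as follows. The sketch uses the standard 7/16, 3/16, 5/16, 1/16 weights with a left-to-right raster scan. The fixed threshold of 128 and the list-of-lists grayscale input are assumptions for this example.

```python
def floyd_steinberg(gray):
    """Binarize an 8-bit grayscale image (list of rows) to values 0 or 255."""
    height, width = len(gray), len(gray[0])
    work = [[float(v) for v in row] for row in gray]  # working copy holding diffused error
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255 if old >= 128 else 0
            out[y][x] = new
            err = old - new
            # Push the quantization error to unprocessed neighbors.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

Unlike a dither matrix, the decision at each pixel depends on its neighbors' errors, which is why error-diffused output has a different spectral signature from screened output in a frequency-domain comparison.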
The problem of phase unwrapping in elevation estimation using interferometric synthetic aperture radar images is approached through the detection of fringe line positions. The position of each fringe line is estimated by fitting the corresponding enhanced phase transition region with a series of basis functions, the coefficients of which are obtained through weighted least squares estimation. The algorithm is applied to a pair of SEASAT SAR images over mountainous terrain. Cosine functions and linear splines are used as the basis functions, and results show the proposed phase unwrapping algorithm can successfully eliminate global errors and reduce local errors.
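The fitting step, estimating basis-function coefficients by weighted least squares, can be sketched in one dimension as below. The sketch uses a cosine basis and solves the weighted normal equations directly. The sample positions, weights, basis indexing, and function names are assumptions for illustration and are not taken from the paper.

```python
import math


def cosine_basis(t, n_terms):
    """Evaluate cos(k * pi * t) for k = 0 .. n_terms - 1."""
    return [math.cos(k * math.pi * t) for k in range(n_terms)]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def weighted_least_squares(ts, ys, ws, n_terms):
    """Minimize sum_i w_i * (y_i - sum_k c_k * cos(k * pi * t_i))^2 over c."""
    rows = [cosine_basis(t, n_terms) for t in ts]
    normal = [[sum(w * r[i] * r[j] for w, r in zip(ws, rows))
               for j in range(n_terms)] for i in range(n_terms)]
    rhs = [sum(w * r[i] * y for w, r, y in zip(ws, rows, ys))
           for i in range(n_terms)]
    return solve_linear(normal, rhs)
```

Larger weights make the fit follow the corresponding samples more closely, so unreliable phase samples can be down-weighted instead of discarded.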