This paper presents a system for automatic image selection for storytelling applications, like slideshows and photobooks,
where targeting a specific image count is usually of high importance. A versatile image collection representation
is introduced, which allows for automatic scalable selection in order to target a specific final image count, while
preserving good coverage of the event in order to maintain the storytelling potential of the selection. A hierarchical
time clustering is presented, which is traversed at a specific hierarchy level in order to select images by alternating
among all time clusters and picking the most relevant images in each cluster. The relevance ordering we use is based on
a combination of features, namely important people, smile detection, image appeal measures, and whether a near-duplicate
of the image has already been selected. Once this Hierarchical Scalable Representation has been created, it can
be reused to generate any target size selection. Two automatic image selection algorithms have been implemented, one
that selects images from clusters with high average image relevance more frequently, and another one that selects images
from larger clusters more frequently. The overall system has been used over the last year on several large image
collections; the resulting selection was presented to their owners in the form of photo-books in order to get feedback,
validating the presented approach.
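As a sketch of the second selection strategy, larger clusters can be visited more often by keeping them in a priority queue keyed on remaining size; the names and weighting below are illustrative, not the paper's exact algorithm.

```python
import heapq

def scalable_select(clusters, target):
    """Select `target` images by alternating among time clusters,
    visiting larger clusters more frequently and taking the most
    relevant unselected image on each visit.

    clusters: list of lists of (relevance, image_id) tuples, one
    list per time cluster at the chosen hierarchy level.
    """
    # Sort each cluster by descending relevance so every visit
    # takes the best remaining image.
    pools = [sorted(c, reverse=True) for c in clusters if c]
    # Max-heap on remaining size: larger clusters are visited first.
    heap = [(-len(p), i) for i, p in enumerate(pools)]
    heapq.heapify(heap)
    selected = []
    while heap and len(selected) < target:
        _, i = heapq.heappop(heap)
        _, image_id = pools[i].pop(0)
        selected.append(image_id)
        if pools[i]:  # re-insert with its reduced size
            heapq.heappush(heap, (-len(pools[i]), i))
    return selected
```

The first strategy (favoring clusters with high average image relevance) follows the same shape, keying the heap on average relevance instead of cluster size.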
Image appeal may be defined as the interest that a photograph generates when viewed by human observers,
incorporating subjective factors on top of the traditional objective quality measures. User studies were conducted in
order to identify the right features to use in an image appeal measure; these studies also revealed that a photograph may
be appealing even if only a region/area of the photograph is actually appealing. Due to the importance of faces regarding
image appeal, a detailed study of a set of face features is also presented, including face size, color and smile detection.
Extensive experimentation helped identify a good set of low level features, which are described in depth. These features
were optimized using extensive ground truth generated from sets of consumer photos covering all possible appeal levels,
by observers with a range of expertise in photography.
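Once a feature set is chosen, a simple way to combine it into an appeal score is a weighted sum; the feature names and weights below are purely illustrative stand-ins for the optimized combination described above.

```python
def appeal_score(features, weights=None):
    """Weighted-sum appeal measure over normalized [0, 1] features.
    Feature names and weights are hypothetical; in the paper the
    combination is optimized against observer ground truth."""
    weights = weights or {"sharpness": 0.3, "contrast": 0.2,
                          "face_size": 0.3, "smile": 0.2}
    return sum(w * features.get(name, 0.0) for name, w in weights.items())
```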
Digital publishing workflows usually require composition and balance within the document: certain photographs must be
chosen according to the overall layout of the document they are going to be placed in, i.e., the composition within the
photograph will have a relationship/balance with the rest of the document layout.
This paper presents a novel image retrieval method, in which the document where the image is to be inserted is used as
query. The algorithm calculates a balance measure between the document and each of the images in the collection,
retrieving the ones that have a higher balance score. The image visual weight map, used in the balance calculation, has
been successfully approximated by a new image quality map that takes into consideration sharpness, contrast and
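One way to sketch the balance idea is as a centre-of-mass computation: place the candidate image's visual weight at its layout slot and score how close the combined centroid of page and image lands to the page centre. This is an illustration of the concept, not the paper's exact measure; all names are hypothetical.

```python
def balance_score(doc_centroid, doc_mass, img_slot, img_mass, page_size):
    """Higher (closer to 0) means better balanced.  Coordinates are in
    page units; masses are total visual weight (e.g. summed from the
    quality map).  The distance-to-centre criterion is illustrative."""
    (dx, dy), (ix, iy) = doc_centroid, img_slot
    m = doc_mass + img_mass
    # Combined centre of mass of existing document weight + image weight.
    cx = (dx * doc_mass + ix * img_mass) / m
    cy = (dy * doc_mass + iy * img_mass) / m
    w, h = page_size
    return -(((cx - w / 2) ** 2 + (cy - h / 2) ** 2) ** 0.5)
```

Retrieval then amounts to scoring every image in the collection against the document and returning the highest-scoring ones.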
Running a targeted campaign involves coordination and management across numerous organizations and complex process flows: market analytics on customer databases, acquiring content and images, composing the materials, meeting the sponsoring enterprise's brand standards, driving production and fulfillment, and evaluating results. All of these processes are currently performed by experienced, highly trained staff. This paper presents a solution that not only brings together technologies that automate each process, but also automates the entire flow, so that a novice user can easily run a successful campaign from their desktop. The technologies, structure, and process flows used to build this system are presented, highlighting how the complexity of running a targeted campaign is hidden from the user through these technologies, while still providing the benefits of a professionally managed campaign.
This paper presents two complementary methods to help in document creation where the document includes color templates (banners, clipart, logos, etc.) as well as photographs. Two problems are addressed. First, given a photograph that a document needs to be built around, extract a good palette of colors that harmonize with the selected photograph, which may be used to generate the color template. The images are segmented with a color-based morphological approach, which identifies regions with a dominant color. Based on the morphology of such "color" regions, and on the other color objects in the template, the scheme picks a set of possible color harmonies (affine, complementary, split complementary, triadic) for the color elements within the document, based on the combined image-document morphology. If the image is changed in the future, the color scheme can be updated automatically. Second, given a document color template, identify from a collection of images the best set that will harmonize with it. The document color template is analyzed in the same way as above, and the results are used to query an image database
in order to pick a set of images that will harmonize the best with such a color scheme.
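The candidate harmony schemes can be sketched as fixed angular offsets from the dominant hue on the colour wheel. The offsets below are the common textbook values (and an illustrative guess for "affine"); the morphological analysis described above would select among and tune such candidates.

```python
def harmony_palettes(dominant_hue):
    """Candidate harmony hue sets (degrees on the colour wheel)
    for a dominant hue, one per scheme named in the abstract."""
    h = dominant_hue % 360
    return {
        # "Affine" modelled here as near-analogous neighbours (a guess).
        "affine": [h, (h + 30) % 360, (h - 30) % 360],
        "complementary": [h, (h + 180) % 360],
        "split_complementary": [h, (h + 150) % 360, (h + 210) % 360],
        "triadic": [h, (h + 120) % 360, (h + 240) % 360],
    }
```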
Certain applications require the extraction of patches of color from an image, along with their size and location; examples include color harmonization algorithms and non-photorealistic rendering. These applications use a fairly small palette of colors, and in both cases large areas of homogeneous color are favored, while high detail is preserved in the smaller areas with a lot of color activity. The main problem this paper tackles is identifying the underlying color in an image region, which will be referred to as its underlying color patch, while protecting as much as possible the detail in high color activity areas. No perfect scene-object segmentation is intended in this process: since different objects may be quantized to the same color, the result may be a merged color patch.
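A minimal sketch of extracting underlying colour patches: coarse RGB binning lets large homogeneous areas dominate the histogram, and every pixel is then mapped to the nearest surviving bin centre. This illustrates the merged-patch behaviour but is far simpler than the paper's morphological approach.

```python
from collections import Counter

def underlying_patches(pixels, bin_size=32, n_colors=4):
    """Reduce an image (a flat list of (r, g, b) tuples) to a small
    palette of underlying colours.  Parameters are illustrative."""
    def bin_center(p):
        # Snap each channel to the centre of its coarse bin.
        return tuple((c // bin_size) * bin_size + bin_size // 2 for c in p)
    counts = Counter(bin_center(p) for p in pixels)
    # Large homogeneous areas win: keep the most frequent bins.
    palette = [c for c, _ in counts.most_common(n_colors)]
    def nearest(p):
        return min(palette, key=lambda q: sum((a - b) ** 2 for a, b in zip(p, q)))
    return palette, [nearest(p) for p in pixels]
```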
Home video collections constitute an important source of content to be experienced within the digital entertainment
context. To make such content easy to access and reuse, various video analysis technologies have been researched and
developed to extract video assets for management tasks, including video shot/scene detection, keyframe extraction, and
video skimming/summarization. However, a less addressed issue is how useful those assets actually are in helping
consumers manage their video collections, and what the usage patterns of the assets look like. In this paper, we present
Personal Video Manager (PVM), both a home video management system and an explorative research platform that enables a
systematic analysis and understanding of consumers' demand for video assets and video processing technologies. To
understand consumers' interests, PVM adopts database management technologies to model and archive how consumers identify video
assets and utilize them for management tasks. The PVM mining engine performs data mining on this archived data to
extract useful knowledge of consumers' preferences for video assets and their behavior in utilizing them. As revealed in
our experiments, consumers' interactions embed rich information that can be leveraged in developing more effective video
analysis techniques.
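The mining step can be illustrated with a toy frequency count over archived interaction records; the record fields here are hypothetical, not PVM's actual schema.

```python
from collections import Counter

def mine_asset_preferences(log):
    """Count how often consumers apply each management action to each
    asset type, ranking the (asset_type, action) pairs by frequency.
    A toy stand-in for the PVM mining engine."""
    prefs = Counter((r["asset_type"], r["action"]) for r in log)
    return prefs.most_common()
```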
We present methods and systems for authoring by linking: generating multimedia documents by creating richly typed links between component media assets. As an example, we describe our Sticky Video functionality in the MEERCAT system for linking photo and video media. We do so within the framework of a hierarchy of possible link-based authoring systems, ranging from manual to programmatic link creation and document authoring. We discuss mechanisms for accurately situating links between rich media components and flexibly typing those links to support both better human browsing and searching and automatic authoring. We describe issues in real-time distributed authoring and the use of metadata channels. In particular, we present the concept of authoring by meeting: the automatic creation of multimedia documents from business meetings.
Fractal image compression is a relatively new and very promising technique for still image compression. However, it is not widely applied due to its very time-consuming encoding procedure. In this research, we focus on speeding up this procedure by introducing three schemes: dimensionality reduction, energy-based classification, and tree search. We have developed an algorithm that combines these three schemes and achieves a speed-up factor of 175 at the expense of only a 0.6 dB degradation in PSNR relative to the unmodified exhaustive search, for a typical image encoded at 0.44 bpp.
Textured images are generally difficult to compress because they contain a large number of high-frequency components, which are difficult to capture with traditional compression schemes such as transform coding, especially at high compression ratios. Since many textures possess a high degree of self-similarity at different scales, fractal compression can be applied to effectively encode such textured images by exploiting this self-similar property. The main drawback of fractal compression is that the encoding procedure is very time consuming. In this research, we focus on speeding up this procedure by introducing three schemes: dimensionality reduction, energy-based classification, and tree search. We have developed an algorithm that combines these three schemes and achieves a speed-up factor of 177 at the expense of only a 0.4 dB degradation in PSNR relative to the unmodified exhaustive search, for a typical textured image encoded at 0.44 bpp.
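The energy-based classification scheme shared by both papers above can be sketched as follows: classify each block by its AC energy (variance about its mean) so that range blocks are matched only against domain blocks in the same energy class, pruning the exhaustive search. The class thresholds below are illustrative.

```python
def block_energy(block):
    """AC energy of an image block (flat list of pixel values):
    the variance about the block mean."""
    n = len(block)
    mean = sum(block) / n
    return sum((p - mean) ** 2 for p in block) / n

def energy_class(block, thresholds=(10.0, 100.0)):
    """Assign a block to an energy class; only same-class domain
    blocks are searched during encoding.  Thresholds are illustrative."""
    e = block_energy(block)
    for i, t in enumerate(thresholds):
        if e < t:
            return i
    return len(thresholds)
```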