The extraction of keywords and features is a fundamental problem in text data mining. Document processing
applications directly depend on the quality and speed of the identification of salient terms and phrases. Applications as
disparate as automatic document classification, information visualization, filtering and security policy enforcement all
rely on the quality of automatically extracted keywords.
Recently, a novel approach to rapid change detection in data streams and documents has been developed. It is based on
ideas from image processing and in particular on the Helmholtz Principle from the Gestalt Theory of human perception.
By modeling a document as a one-parameter family of graphs with its sentences or paragraphs defining the vertex set
and with edges defined by Helmholtz's principle, we demonstrated that for some range of the parameters, the resulting
graph becomes a small-world network.
In this article we investigate the natural orientation of edges in such small-world networks. For two connected sentences,
we can say which one is the first and which one is the second, according to their position in a document. This makes
such a graph look like a small WWW-type network, on which PageRank-type algorithms produce an interesting ranking of
nodes in such a document.
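
As a rough illustration of ranking sentences in such an oriented graph, the sketch below runs a plain power-iteration PageRank. The direction of each edge (from the later sentence to the earlier one), the damping factor, and the five-sentence example are assumptions made for illustration; the abstract does not specify them, and the Helmholtz-based edge construction is not reproduced here.

```python
def pagerank(n, edges, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph with nodes 0..n-1."""
    out_links = {v: [] for v in range(n)}
    for src, dst in edges:
        out_links[src].append(dst)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread uniformly.
        dangling = sum(rank[v] for v in range(n) if not out_links[v])
        new_rank = [(1.0 - damping) / n + damping * dangling / n] * n
        for src, targets in out_links.items():
            for dst in targets:
                new_rank[dst] += damping * rank[src] / len(targets)
        rank = new_rank
    return rank


def orient_edges(undirected_edges):
    """Orient each edge from the later sentence to the earlier one (assumption)."""
    return [(max(a, b), min(a, b)) for a, b in undirected_edges]


# Hypothetical document with five sentences, indexed by position.
undirected = [(0, 1), (0, 2), (0, 4), (1, 3), (2, 3)]
ranks = pagerank(5, orient_edges(undirected))
ranking = sorted(range(5), key=lambda v: ranks[v], reverse=True)
```

With this orientation, sentences that many later sentences connect back to accumulate rank; reversing the orientation would instead favor late sentences.
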
We present hp2.me, a URL shortener service for improving the mobile web consumption experience. Unlike
other such services, given a short URL, hp2.me returns an image rendered from the salient regions of a web
page. This approach to displaying web content improves mobile web reading experience through reduced
latency and improved clarity. It is faster to load a few large image files over a cellular data network than
many small files, and limited mobile screen real estate can be better utilized to display relevant content. The
hp2.me service is currently in external beta, and we present results to illustrate the advantages of this
We present HP SmartPrint, a novel web browser plug-in which automatically suggests print-worthy content within a web
page and provides an intuitive UI for users to make corrections to the initial suggestion, if needed. The resulting prints
contain only user-desired content and exclude noise such as ads, thus increasing the desirability of the prints while
minimizing the cost. This solution provides a streamlined web printing experience and will be shipping with most HP
printers starting in 2011.
This paper describes the usage of images in tangible products as a function of their origin, that is, whether they come
from digital still cameras (DSC) or mobile devices. It is also shown that pictures from mobile devices are mainly used to
complete storytelling in photo books; they are currently not a driver for generating this kind of high-value
products. Images taken from mobile devices generate to a great extent only prints mainly ordered via kiosk
We present an algorithm for smart image fitting: changing the size of an image so that it may fit "naturally" within a
given frame. As the frame's dimensions will generally differ from those of the image, the algorithm preserves important
details in their original aspect ratio, while less important details undergo more substantial deformations. This capability is
useful for many commercial print applications. One example is the HP SmartStream Designer, which is a tool to create
variable and personalized content documents.
In this paper, we present an all-new custom path that gives consumers full control over their photos and the
format of their books, while providing them with guidance to make their creation fast and easy. The users can
choose to fully automate the initial creation, and then customize every page. The system manages many design
themes along with numerous design elements, such as layouts, backgrounds, embellishments and pattern bands.
The users can also utilize photos from multiple sources including their computers, Shutterfly accounts, Shutterfly
Share sites and Facebook. The users can also use a photo as background, add, move and resize photos and text
- putting what they want where they want instead of being confined to templates. The new path allows users to
add embellishments anywhere in the book, and the high-performance platform can support up to 1,000 photos
per book and up to 25 pictures per page. The path offers either Smart Autofill or Storyboard features allowing
customers to populate their books with photos so they can add captions and customize the pages.
Automatically quantifying the aesthetic appeal of images is an interesting problem in computer science and image
processing. In this paper, we incorporate aesthetic properties and convert them into computable image features for
classifying photographs taken by amateur and professional photographers. In particular, color histograms, spatial edge
distribution, and repetition identification are used as features. Results of experiments on professional and amateur
photograph data sets confirm the discriminative power of these features.
Social media is becoming increasingly prevalent with the advent of web 2.0 technologies. Popular social media
websites, such as Twitter and Facebook, are attracting a gigantic number of online users to post and share information.
An interesting phenomenon within this trend is that more and more users share their experiences
or issues with a product, and product service agents then use commercial social media listening and
engagement tools (e.g. Radian6, Sysomos, etc.) to respond to users' complaints or issues and help them tackle
their problems. This is often called customer care in social media or social customer relationship management
(CRM). However, existing commercial social media tools provide only aggregate-level trends,
patterns and sentiment analysis based on keyword-centric, brand-relevant data, which offers little insight for
answering one of the key questions in a social CRM system: how effective is our social customer care engagement?
In this paper, we focus on addressing the problem of how to measure the effectiveness of engagement for service
agents in customer care. Traditional CRM effectiveness measurements are defined under the scenario of the call
center, where the effectiveness is mostly based on the duration time per call and/or number of answered calls
per day. Different from customer care in a call center, we can obtain detailed conversations between agents
and customers in social media, and therefore the effectiveness can be measured by analyzing the content of
conversations and the sentiment of customers.
Images are one of the key components of a social network. An image store needs to be highly scalable and to provide redundancy, high availability and the ability to grow in size. Efficiency is also required so that disk storage and the need for processing power can be minimized.
Tuenti's image storage uses a Content Delivery Network (CDN) as a web cache that allows us to meet high throughput requirements. When an image is not cached in the CDN, it is requested from the Image Routing Layer (IRL), which is in charge of finding its physical location. If the IRL cannot retrieve the image from one of the locations, it fetches it from one of the other available copies, so that neither the CDN nor the user notices the miss. If the requested size is not available in the storage, the IRL automatically resizes the best size available and serves it back. Expensive operations, such as finding the physical location or resizing, are only done when there is a cache miss on the CDN.
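
The request path described above can be sketched as follows. This is a toy model under stated assumptions, not Tuenti's implementation: the function names, the dictionary-based servers, the `None` marker for an unreachable replica, and the rule for choosing the "best size available" (smallest stored size at least as large as the request, otherwise the largest stored size) are all hypothetical.

```python
def fetch(servers, locations, image_id, size):
    """Return the first reachable copy of the image at this size, or None."""
    for name in locations.get(image_id, []):
        store = servers.get(name)
        if store is None:  # this toy model marks an unreachable server as None
            continue
        if (image_id, size) in store:
            return store[(image_id, size)]
    return None


def best_available_size(servers, locations, image_id, size):
    """Prefer the smallest stored size >= requested, else the largest stored."""
    sizes = set()
    for name in locations.get(image_id, []):
        store = servers.get(name) or {}
        sizes.update(s for (img, s) in store if img == image_id)
    if not sizes:
        return None
    larger = [s for s in sizes if s >= size]
    return min(larger) if larger else max(sizes)


def route(servers, locations, image_id, size):
    """Image Routing Layer: locate the image, resizing if the size is missing."""
    image = fetch(servers, locations, image_id, size)
    if image is not None:
        return image
    best = best_available_size(servers, locations, image_id, size)
    if best is None:
        return None
    source = fetch(servers, locations, image_id, best)
    # Stand-in for a real resize: record what the image was derived from.
    return {"id": image_id, "size": size, "resized_from": source["size"]}


def cdn_get(cache, servers, locations, image_id, size):
    """CDN front end: the routing layer is only consulted on a cache miss."""
    key = (image_id, size)
    if key not in cache:
        image = route(servers, locations, image_id, size)
        if image is None:
            return None
        cache[key] = image
    return cache[key]


servers = {
    "storage-1": None,  # unreachable replica
    "storage-2": {("img1", 600): {"id": "img1", "size": 600, "resized_from": None}},
}
locations = {"img1": ["storage-1", "storage-2"]}
cache = {}
original = cdn_get(cache, servers, locations, "img1", 600)
thumbnail = cdn_get(cache, servers, locations, "img1", 200)
```

Once the thumbnail has been cached, a repeated request for it is answered from `cache` without touching the storage servers at all.
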
The physical storage is split into homogeneous buckets that are spread across the storage servers. The growth strategy is to add more storage servers and to rebalance buckets towards them. Rebalancing not only frees space on full servers but also allows the upload bandwidth to increase, because there will be fewer buckets per server and so fewer uploads per server.
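
A minimal sketch of such a rebalancing step is given below. The greedy policy (repeatedly move one bucket from the most loaded server to the least loaded one until bucket counts differ by at most one) and all names are assumptions for illustration; the text does not say how buckets are actually chosen or moved.

```python
def rebalance(assignment, new_servers):
    """Return a new server -> buckets mapping after adding empty servers."""
    servers = {name: list(buckets) for name, buckets in assignment.items()}
    for name in new_servers:
        servers.setdefault(name, [])
    while True:
        fullest = max(servers, key=lambda s: len(servers[s]))
        emptiest = min(servers, key=lambda s: len(servers[s]))
        # Stop once no server holds more than one bucket above any other.
        if len(servers[fullest]) - len(servers[emptiest]) <= 1:
            return servers
        servers[emptiest].append(servers[fullest].pop())


# Hypothetical cluster: two full servers, one newly added server.
before = {"storage-a": list(range(0, 6)), "storage-b": list(range(6, 12))}
after = rebalance(before, ["storage-c"])
```

In this example each server ends up with four buckets instead of six, which is the effect the paragraph attributes to rebalancing: fewer buckets, and therefore fewer uploads, per server.
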
The human visual system has the property of perceiving object color as constant regardless of the prevailing
illumination. However, digital cameras usually lack this capability, and the captured images are digitally corrected to
discount the color of the scene light based on the estimated illuminant. Illumination estimation might be erroneous in
some artificial or chromatic lighting conditions. A method was proposed to correct digital photos captured with a
smartphone camera using the smartphone owner's face as the reference. Taking advantage of the latest smartphones
with two built-in cameras, we could use the front camera to capture the smartphone owner's face and compare it with the
saved reference face image in order to estimate the scene illuminant. After that, we could properly adjust the capture
setting for the main camera in order to take a decent target image; or we could automatically correct the target image
based on the estimated illumination by comparing two face images. The method was implemented on the iOS mobile
platform. Experimental results show that the images adjusted using the proposed method are generally more favorable
than the pictures taken directly by the default camera application.
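
One simple way to turn the comparison of two face images into a correction is a per-channel gain (a diagonal, von Kries style model). The sketch below is an illustration under strong assumptions rather than the method of the paper: it assumes the reference face was stored under neutral light, that both crops contain only skin pixels, and that both cameras respond identically. All names and pixel values are hypothetical.

```python
def channel_means(pixels):
    """Mean R, G and B values of a list of (r, g, b) pixels."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def estimate_gains(reference_face, captured_face):
    """Per-channel gains that map the captured face back to the reference."""
    ref = channel_means(reference_face)
    cap = channel_means(captured_face)
    return tuple(r / max(c, 1e-6) for r, c in zip(ref, cap))


def correct(pixels, gains):
    """Apply the gains to every pixel, clipping to the 8-bit range."""
    return [
        tuple(min(255, round(v * g)) for v, g in zip(p, gains))
        for p in pixels
    ]


# Hypothetical data: the same face under neutral light and under warm light.
reference_face = [(200, 150, 120)] * 4
captured_face = [(200, 120, 60)] * 4
gains = estimate_gains(reference_face, captured_face)

target_image = [(100, 80, 40), (50, 40, 20)]
corrected = correct(target_image, gains)
```

The same gains could in principle be used either to post-correct the target image, as here, or to choose capture settings before the main camera fires.
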
XML is widely used in various document formats on the web. But it has caused negative impacts such as
expensive document distribution time over the web, and long content jumping and rendering delay, especially on
mobile devices. Hence we proposed a Schema-based efficient queryable XML compressor, called XTrim, which
significantly improves compression ratio by utilizing optimized information in XML Schema while supporting
efficient queries. First, XTrim extracts structure information from the XML document and the corresponding XML
Schema. Then a novel technique is used to transform the XML tree-like structure into a compact indexed
form to support efficient queries. At the same time, text values are obtained, and a language-based text trim
method (LTT) that facilitates language-specific text compressors is adopted to reduce the size of text values
in various languages. In LTT a word composition detection method is proposed to better process text in
non-Latin languages. To evaluate the performance of XTrim, we have implemented a compressor and query
engine prototype. Extensive experiments show that XTrim outperforms XMill and existing queryable
alternatives in both compression ratio and query efficiency. Applying XTrim to documents saves up to
30% of storage space and reduces the content jumping and rendering delay from 4 seconds to less than
100 ms.
The advent of viable long tail & self-publishing solutions has spawned new requirements for automatic layout
technologies. In most cases these attempt to lay out whole pages, spreads or documents based on complete content data.
In this paper we introduce a new approach to document layout based on the principle of interactive design reuse, in
which a new design is created from an existing high quality design via a sequence of simple steps to establish the final
content. Based on our experience building such a system we propose a method of building layout hierarchies and discuss
the implementation of editing operations appropriate to this new paradigm.
Automatic layout algorithms simplify the composition of image-rich documents, but they still require users to have
sufficient artistry to supply well cropped and composed imagery. Combining an automatic cropping technology with a
document layout system enables better results to be produced faster by less-skilled users. This paper reviews prior work
in automatic image cropping and automatic page layout and presents a case for a combined crop and layout technology.
We describe one such technology in a system for interactive publication design by amateur self-publishers and show that
providing an automatic cropping system with additional information about the layout context can enable it to generate a
more appropriate set of ranked crop options for a given image. Furthermore, we show that providing an automatic layout
system with sets of ranked crop options for images can enable it to compose more appropriate page layouts.
Applications that classify and search documents based on their visual appearance need to recognize what
document features are the most critical to human perception when humans compare the documents. This paper presents
the results of a psychophysical experiment where subjects were asked to group the documents based on their visual
similarity. Results from 15 subjects were saved into similarity matrices, and tested for inter-rater agreement. The
similarity matrix averaged across the subjects was analyzed using agglomerative hierarchical clustering to identify the
clusters. The humans' clustering was approximated with the weighted sum of four distance matrices that we calculated
based on four document features. We identified the relative importance of the document features using an optimization
method. Then, we tested the approximation using K-fold cross validation and the K-nearest neighbor algorithm. The
results of the testing confirm the effectiveness of our approach.
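
To make the weighted-sum approximation concrete, the sketch below fits non-negative weights that sum to one so that a combination of four feature distance matrices best matches a target matrix in the least-squares sense. The coarse grid search stands in for the unspecified optimization method, and the matrices are invented toy data, not the experimental results.

```python
from itertools import product


def symmetric(upper, n=4):
    """Build an n x n symmetric zero-diagonal matrix from its upper triangle."""
    matrix = [[0.0] * n for _ in range(n)]
    values = iter(upper)
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = float(next(values))
    return matrix


def combine(weights, matrices):
    """Weighted sum of equally sized matrices."""
    n = len(matrices[0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, matrices)) for j in range(n)]
        for i in range(n)
    ]


def squared_error(a, b):
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def fit_weights(matrices, target, steps=10):
    """Grid search over weights k/steps that are non-negative and sum to 1."""
    best_weights, best_error = None, float("inf")
    for combo in product(range(steps + 1), repeat=len(matrices) - 1):
        if sum(combo) > steps:
            continue
        counts = combo + (steps - sum(combo),)
        weights = tuple(c / steps for c in counts)
        error = squared_error(combine(weights, matrices), target)
        if error < best_error:
            best_weights, best_error = weights, error
    return best_weights, best_error


# Toy distance matrices for four documents, one matrix per document feature.
features = [
    symmetric([1, 2, 0, 0, 0, 0]),
    symmetric([0, 0, 3, 1, 0, 0]),
    symmetric([0, 0, 0, 0, 2, 0]),
    symmetric([0, 0, 0, 0, 0, 4]),
]
# Stand-in for the averaged human similarity matrix.
human = combine((0.5, 0.3, 0.2, 0.0), features)
weights, error = fit_weights(features, human)
```

The fitted weights can then be read as the relative importance of each feature; a feature with weight zero contributes nothing to the approximation.
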
Managing large document databases has become an important task. Sorting documents with respect to their
visual similarity and layout features, and visualization of the whole document database is a desirable application.
A user may wish to search for documents in a database that are similar to a query in terms of their stylistic
features, or he/she may want to browse the whole database. In these tasks, clustering similar documents and
organizing the document database with respect to the clusters is preferable to presenting documents in a random
order. In this paper, we propose organization of single-page documents in a 3-D hierarchical structure called a
similarity pyramid. The pyramid is constructed from a stack of document database embeddings on a 2-D surface
with the help of a nonlinear dimensionality reduction algorithm called Isomap. The mapping algorithm preserves
similarity distances between documents by mapping documents that are close to each other in a feature space to
points on a low-dimensional surface that are close to each other. Higher levels of the pyramid consist of document
image icons that represent a large group of roughly similar documents, whereas lower levels contain document
image icons representing small groups of very similar documents. A user can browse the database by moving
along a certain level of a pyramid or by moving between different levels
In this paper, we propose a system for automatic design of magazine covers that quantifies a number of concepts from
art and aesthetics. Our solution to automatic design of this type of media has been shaped by input from professional
designers, magazine art directors and editorial boards, and journalists. Consequently, a number of principles in design
and rules in designing magazine covers are delineated. Several techniques are derived and employed in order to quantify
and implement these principles and rules in the format of a software framework. At this stage, our framework divides the
task of design into three main modules: layout of magazine cover elements, choice of color for masthead and cover lines,
and typography of cover lines. Feedback from professional designers on our designs suggests that our results are
congruent with their intuition.
Smart TVs has been introduced. Second, applications running on mobile devices (so called "second-screen apps") have
significantly enriched TV watching experience. As an enabler of content-aware TVs and apps, automatic content
recognition (ACR) is attracting a lot of attention recently. This paper presents an overview of ACR in this context. It
attempts to answer a number of questions: Why do we need ACR for the next generation TV experience? What is the
relationship between ACR and existing technologies? What are the unique requirements and challenges for ACR in those
applications? What are the typical implementation architectures? It also describes the existing products in this space.
Marketing instruments with nested, short-form, symbol-loaded content need to be studied differently. Image
classification in the Web 2.0 world can dynamically use a configurable amount of internal and external data as well as
varying levels of crowd-sourcing. Our work is one such examination of how to construct a hybrid technique involving
learning and crowd-sourcing. Through a parameter called turkmix and a multitude of crowd-sourcing techniques
available we show that we can control the trend of metrics such as precision and recall on the hybrid categorizer.
Lately, image personalization is becoming an interesting topic. Images with variable elements such as text usually
appear much more appealing to the recipients. In this paper, we describe a method to pre-analyze the image
and automatically suggest to the user the most suitable regions within an image for text-based personalization.
The method is based on input gathered from experiments conducted with professional designers. It has been
observed that regions that are spatially smooth and regions with existing text (e.g. signage, banners, etc.) are the
best candidates for personalization. This gives rise to two sets of corresponding algorithms: one for identifying
smooth areas, and one for locating text regions. Furthermore, based on the smooth and text regions found in
the image, we derive an overall metric to rate the image in terms of its suitability for personalization (SFP).
A watermark embedding scheme has been developed to insert a watermark with the maximum signal
strength for a user-selectable visibility constraint. By altering the watermark strength and direction to
meet a visibility constraint, the maximum watermark signal for a particular image is inserted. The
method consists of iterative embed software and a full color human visibility model plus a watermark
signal strength metric.
The iterative approach is based on the intersections between hyper-planes, which represent visibility and
signal models, and the edges of a hyper-volume, which represent output device visibility and gamut
constraints. The signal metric is based on the specific watermark modulation and detection methods and
can be adapted to other modulation approaches. The visibility model takes into account the different
contrast sensitivity functions of the human eye to L, a and b, and masking due to image content.
This paper investigated the problem of orientation detection for document images with Chinese characters. These images
may be in one of four orientations: right side up, upside down, or rotated 90° or 270° counterclockwise. First, we presented the
structure of a text-recognition-based orientation detection algorithm. Text line verification and orientation judgment
methods were the main focus, and multiple experiments were then carried out. Distance-difference-based text line
verification and confidence-based text line verification were proposed and compared with methods without text line
verification. Then, a picture-based orientation detection framework was adopted for the situation where no text line was
detected. This high-level classification problem was solved by relatively low-level vision features including Color
Moments (CM) and Edge Direction Histogram (EDH), with a distance-based classification scheme. Finally, a confidence-based
classifier combination strategy was employed in order to make full use of the complementarity between different
features and classifiers. Experiments showed that both text line verification methods were able to improve the accuracy
of orientation detection, and picture-based orientation detection performed well on the no-text image set.
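
A confidence-based combination of classifiers can be as simple as summing, for each candidate orientation, the confidence reported by every classifier and picking the orientation with the highest total. The sketch below shows that sum rule; the abstract does not state which combination rule was used, so the rule, the function name, and the confidence values are assumptions for illustration.

```python
ORIENTATIONS = (0, 90, 180, 270)  # degrees, counterclockwise


def combine_by_confidence(classifier_outputs):
    """Sum per-orientation confidences and return the best orientation."""
    totals = {orientation: 0.0 for orientation in ORIENTATIONS}
    for scores in classifier_outputs:
        for orientation, confidence in scores.items():
            totals[orientation] += confidence
    best = max(ORIENTATIONS, key=lambda o: totals[o])
    return best, totals


# Hypothetical confidences from a Color Moments and an EDH classifier.
cm_scores = {0: 0.40, 90: 0.10, 180: 0.45, 270: 0.05}
edh_scores = {0: 0.50, 90: 0.20, 180: 0.20, 270: 0.10}
best, totals = combine_by_confidence([cm_scores, edh_scores])
```

Here the Color Moments classifier alone would choose 180°, but the combined evidence favors 0°, which is the kind of complementarity between features the combination is meant to exploit.
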