Cyber-physical systems (CPS), such as smart buildings and data centers, are richly instrumented systems composed of tightly coupled computational and physical elements that generate large amounts of data. To explore CPS data and obtain actionable insights, we present a new approach called Radial Pixel Visualization (RPV), which uses multiple concentric rings to show the data in a compact circular layout of pixel cells, each ring containing the values for a specific variable over time and each pixel cell representing an individual data value at a specific time. RPV provides an effective visual representation of the locality and periodicity of high-volume, multivariate data streams. RPVs may have an additional analysis ring for highlighting the results of correlation analysis or peak point detection. Our real-world applications demonstrate the effectiveness of this approach. The application examples show how RPV can help CPS administrators identify periodic thermal hot spots, find root causes of cooling problems, understand building energy consumption, and optimize IT-services workloads.
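As a rough illustration of the ring layout described above, the following sketch maps a (variable, time step) pair to a polar pixel cell. The abstract does not give the exact geometry, so the equal angular slots, the inside-out ring order, and the names rpv_cell, inner_radius, and ring_width are all assumptions made for this example.

```python
import math

def rpv_cell(var_index, t, n_steps, inner_radius=1.0, ring_width=1.0):
    """Return (radius, start_angle, end_angle) in radians for one pixel cell.

    Assumed layout: ring number var_index holds one variable, and the
    circle is split into n_steps equal angular slots, one per time step.
    """
    radius = inner_radius + var_index * ring_width
    slot = 2 * math.pi / n_steps
    return radius, t * slot, (t + 1) * slot

# Hour 6 of a 24-step day for the third variable (index 2).
radius, start, end = rpv_cell(2, 6, 24)
```

With one full turn per day, values recorded at the same hour line up along the same angle, which is one plausible way the periodicity mentioned in the abstract could become visible.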
Twitter currently receives over 190 million tweets (small text-based Web posts) and manufacturing companies receive over 10
thousand web product surveys a day, in which people share their thoughts regarding a wide range of products and their features. A
large number of tweets and customer surveys include opinions about products and services. However, with Twitter being a relatively
new phenomenon, these tweets are underutilized as a source for determining customer sentiments. To explore high-volume customer
feedback streams, we integrate three time series-based visual analysis techniques: (1) feature-based sentiment analysis that extracts,
measures, and maps customer feedback; (2) a novel idea of term associations that identify attributes, verbs, and adjectives frequently
occurring together; and (3) new pixel cell-based sentiment calendars, geo-temporal map visualizations and self-organizing maps to
identify co-occurring and influential opinions. We have combined these techniques into a well-fitted solution for the effective analysis
of large customer feedback streams, such as movie reviews (e.g., Kung-Fu Panda) or web surveys (buyers).
The detection of previously unknown, frequently occurring patterns in time series, often called motifs, has been
recognized as an important task. However, it is difficult to discover and visualize these motifs as their numbers
increase, especially in large multivariate time series. To find frequent motifs, we use several temporal data mining
and event encoding techniques to cluster and convert a multivariate time series to a sequence of events. Then we
quantify the efficiency of the discovered motifs by linking them with a performance metric. To visualize frequent
patterns in a large time series with potentially hundreds of nested motifs on a single display, we introduce three
novel visual analytics methods: (1) motif layout, using colored rectangles for visualizing the occurrences and
hierarchical relationships of motifs in a multivariate time series, (2) motif distortion, for enlarging or shrinking
motifs as appropriate for easy analysis, and (3) motif merging, to combine a number of identical adjacent motif
instances without cluttering the display. Analysts can interactively optimize the degree of distortion and merging to
get the best possible view. A specific motif (e.g., the most efficient or least efficient motif) can be quickly detected
from a large time series for further investigation. We have applied these methods to two real-world data sets: data
center cooling and oil well production. The results provide important new insights into the recurring patterns.
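The motif merging idea can be sketched as a run-length style pass over the motif instances. This is only an illustration: the tuple layout, the function name merge_adjacent, and the rule that two instances are "adjacent" when one ends exactly where the next begins are assumptions, not details taken from the paper.

```python
def merge_adjacent(instances):
    """Merge runs of identical, back-to-back motif instances.

    instances: list of (motif_id, start, end) sorted by start.
    Returns a list of (motif_id, start, end, count), where count is the
    number of original instances folded into the merged one.
    """
    merged = []
    for motif_id, start, end in instances:
        # Assumed adjacency rule: same motif and no gap between instances.
        if merged and merged[-1][0] == motif_id and merged[-1][2] == start:
            prev = merged[-1]
            merged[-1] = (motif_id, prev[1], end, prev[3] + 1)
        else:
            merged.append((motif_id, start, end, 1))
    return merged

runs = merge_adjacent([("A", 0, 5), ("A", 5, 10), ("B", 10, 12), ("A", 12, 17)])
```

Keeping the count alongside each merged span lets a display draw one rectangle per run while still indicating how many occurrences it stands for.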
The scatter plot is a well-known method of visualizing pairs of
two-dimensional continuous variables. Multidimensional
data can be depicted in a scatter plot matrix. Scatter plots are intuitive and easy to use, but often have a high degree
of overlap which may occlude a significant portion of data. In this paper, we propose variable binned scatter plots to
allow the visualization of large amounts of data without overlapping. The basic idea is to use a non-uniform (variable)
binning of the x and y dimensions and to plot all the data points that fall within each bin into corresponding squares.
Further, we map a third attribute to color for visualizing clusters. Analysts are able to interact with individual data points
for record-level information. We have applied these techniques to solve real-world problems on credit card fraud and
data center energy consumption to visualize their data distribution and cause-effect relationships among multiple attributes. A
comparison of our methods with two recent well-known variants of scatter plots is included.
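One simple way to obtain a non-uniform binning of the x and y dimensions is to place bin edges at quantiles, so that dense value ranges receive more bins. The abstract does not say how the variable bins are chosen, so the quantile rule and the helper names quantile_edges, bin_index, and bin_points are assumptions for this sketch.

```python
from bisect import bisect_right

def quantile_edges(values, n_bins):
    """Return n_bins - 1 interior edges so each bin holds roughly equal counts."""
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)]

def bin_index(value, edges):
    """Index of the bin that value falls into (0 .. len(edges))."""
    return bisect_right(edges, value)

def bin_points(points, n_bins):
    """Group (x, y) points by their (x_bin, y_bin) cell."""
    x_edges = quantile_edges([p[0] for p in points], n_bins)
    y_edges = quantile_edges([p[1] for p in points], n_bins)
    cells = {}
    for x, y in points:
        key = (bin_index(x, x_edges), bin_index(y, y_edges))
        cells.setdefault(key, []).append((x, y))
    return cells

# Skewed data: the edges adapt so that every bin holds two values.
edges = quantile_edges([1, 2, 3, 4, 100, 200, 300, 400], 4)
```

Each cell of the resulting grid can then be drawn as a square holding its own points side by side, which is what avoids the overlap of a conventional scatter plot.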
Most data streams are multi-dimensional, high-speed, and contain massive volumes of continuous information. They are seen in daily applications, such as telephone calls, retail sales, data center performance, and oil production operations. Many analysts want insight into the behavior of this data. They want to catch the exceptions in flight to reveal the causes of the anomalies and to take immediate action. To guide the user in finding the anomalies in a large data stream quickly, we derive a new automated neighborhood threshold marking technique, called AnomalyMarker. This technique is built on cell-based data streams and user-defined thresholds. We extend the scope of the data points around the threshold to include the surrounding areas. The idea is to define a focus area (marked area) which enables users to (1) visually group the interesting data points related to the anomalies (i.e., problems that occur persistently or occasionally) for observing their behavior; and (2) discover the factors related to the anomaly by visualizing the correlations between the problem attribute and the attributes of the nearby data items from the entire multi-dimensional data stream.
Mining results are quickly presented in graphical representations (i.e., tooltips) for the user to zoom into the problem
regions. Several algorithms are introduced to optimize the size and extent of the anomaly markers. We have
successfully applied this technique to detect data stream anomalies in large real-world enterprise server performance and data center energy management.
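The neighborhood threshold marking idea can be illustrated on a one-dimensional sequence of cells: every cell above a user-defined threshold is extended by a few neighbors on each side, and overlapping or touching extensions are merged into one marked area. The fixed radius used here is an assumption; the abstract mentions algorithms that optimize marker size and extent but does not describe them, and the name mark_anomalies is hypothetical.

```python
def mark_anomalies(values, threshold, radius=1):
    """Return merged (start, end) index ranges, inclusive, around cells above threshold.

    Assumption: each anomalous cell pulls in a fixed number of neighbors
    (radius) on both sides; touching or overlapping ranges are merged.
    """
    marks = []
    for i, v in enumerate(values):
        if v <= threshold:
            continue
        start = max(0, i - radius)
        end = min(len(values) - 1, i + radius)
        if marks and start <= marks[-1][1] + 1:
            # Extend the previous marked area instead of opening a new one.
            marks[-1] = (marks[-1][0], max(marks[-1][1], end))
        else:
            marks.append((start, end))
    return marks

areas = mark_anomalies([1, 1, 9, 1, 1, 1, 1, 8, 9, 1], threshold=5, radius=1)
```

Each returned range is a candidate focus area whose cells could then be compared against the other attributes of the stream.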
Time series data commonly occur when variables are monitored over time. Many real-world applications involve the comparison of long time series across multiple variables (multi-attributes). Often business people want to compare this year's monthly sales with last year's sales to make decisions. Data warehouse administrators (DBAs) want to know their daily data loading job performance. DBAs need to detect the outliers early enough to act upon them. In this paper, two new visual analytic techniques are introduced: the color cell-based Visual Time Series Line Charts and Maps highlight significant changes over time in a long time series, and the new Visual Content Query facilitates finding the contents and histories of interesting patterns and anomalies, which leads to root cause identification. We have applied both methods to two real-world applications, mining enterprise data warehouse and customer credit card fraud data, to illustrate the wide applicability and usefulness of these techniques.
Charts and tables are commonly used to visually analyze data. These graphics are simple and easy to understand, but
charts show only highly aggregated data and present only a limited number of data values while tables often show
too many data values. As a consequence, these graphics may either lose or obscure important information, so
different techniques are required to monitor complex datasets. Users need more powerful visualization techniques to
digest and compare detailed multi-attribute data to analyze the health of their business. This paper proposes an
innovative solution based on the use of pixel-matrix displays to represent transaction-level information. With pixel-matrices,
users can visualize areas of importance at a glance, a capability not provided by common charting
techniques. We present our solutions to use colored pixel-matrices in (1) charts for visualizing data patterns and
discovering exceptions, (2) tables for visualizing correlations and finding root-causes, and (3) time series for
visualizing the evolution of long-running transactions. The solutions have been applied with success to product
sales, Internet network performance analysis, and service contract applications demonstrating the benefits of our
method over conventional graphics. The method is especially useful when detailed information is a key part of the
A common approach to analyzing geo-related data is to use bar charts or x-y plots. They are intuitive and easy to use, but important information often gets lost. In this paper, we introduce a new interactive visualization technique called Geo Pixel Bar Charts, which combines the advantages of Pixel Bar Charts and interactive maps. This technique allows analysts to visualize large amounts of spatial data without aggregation and shows the geographical regions corresponding to the spatial data attribute at the same time. We apply Geo Pixel Bar Charts to visually mine sales transactions and Internet usage from different locations. Our experimental results show the effectiveness of this technique for providing data distribution and immediate identification of anomalies from the map.
Business Intelligence (BI) deals with transforming raw business data into valuable information for making decisions. The goal is to improve the operation and use of large-scale, complex information systems. A number of automated BI techniques are available. These methods, however, have to be supported by user interaction to make successful business decisions. In this paper, we present a new technique for interactive business intelligence based on visualization technology, called VisImpact. The basic idea of the VisImpact technique is to visually display the relationships between the important business operation parameters and the distribution of the process flow. We have applied VisImpact in the areas of business contract analysis, business operation analysis, and fraud analysis, to show the power of the VisImpact technique for finding process flows, patterns, and trends, and for a quick identification of exceptions (outliers). Our interactive VisImpact system provides the means for an instant drilldown to a transaction record level which allows observing the evolution of business dynamics.
Basic bar charts have been commonly available, but they only show highly aggregated data. Finding the valuable information hidden in the data is essential to the success of business. We describe a new visualization technique called pixel bar charts, which are derived from regular bar charts. The basic idea of a pixel bar chart is to present all data values directly instead of aggregating them into a few data values. Pixel bar charts provide data distribution and exceptions besides aggregated data. The approach is to represent each data item (e.g., a business transaction) by a single pixel in the bar chart. The attribute of each data item is encoded into the pixel color, and users can drill down from a pixel to its detail information as needed. Different color mappings are used to represent multiple attributes. This technique has been prototyped in three business service applications at Hewlett Packard Laboratories: Business Operation Analysis, Sales Analysis, and Service Level Agreement Analysis. Our applications show the wide applicability and usefulness of this new idea.
Web transactions are multidimensional and have a number of attributes: client, URL, response times, and numbers of messages. One of the key questions is how to simultaneously lay out in a graph the multiple relationships, such as the relationships between the web client response times and URLs in a web access application. In this paper, we describe a freeze technique to enhance a physics-based visualization system for web transactions. The idea is to freeze one set of objects before laying out the next set of objects during the construction of the graph. As a result, we substantially reduce the force computation time. This technique consists of three steps: automated classification, a freeze operation, and a graph layout. These three steps are iterated until the final graph is generated. This iterated-freeze technique has been prototyped in several e-service applications at Hewlett Packard Laboratories. It has been used to visually analyze large volumes of service and sales transactions at online web sites.
Proc. SPIE. 4665, Visualization and Data Analysis 2002
KEYWORDS: Optical spheres, Visual analytics, Visualization, Information technology, Associative arrays, Analytical research, Information theory, 3D visualizations, Information visualization, Data analysis
Real-world data distributions are seldom uniform. Clutter and sparsity commonly occur in visualization. Often, clutter results in overplotting, in which certain data items are not visible because other data items occlude them. Sparsity results in the inefficient use of the available display space. Common mechanisms to overcome this include reducing the amount of information displayed or using multiple representations with a varying amount of detail. This paper describes our experiments on Non-Linear Visual Space Transformations (NLVST). NLVST encompasses several innovative techniques: (1) employing a histogram for calculating the density of data distribution; (2) mapping the raw data values to a non-linear scale for stretching a high-density area; (3) tightening the sparse area to save the display space; (4) employing different color ranges of values on a non-linear scale according to the local density. We have applied NLVST to several web applications: market basket analysis, transactions observation, and IT search behavior analysis.
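Steps (1) to (3) above resemble histogram equalization, and a minimal sketch under that assumption follows. It gives each histogram bin a share of the display axis proportional to the number of values it contains, so dense ranges are stretched and sparse ranges are tightened. The function name density_scale, the equal-width histogram, and the piecewise-linear mapping are choices made for this example and are not taken from the paper.

```python
def density_scale(values, n_bins, lo, hi):
    """Build a value -> [0, 1] mapping that gives dense histogram bins more room.

    Assumes values is non-empty and every value lies in [lo, hi].
    """
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    total = sum(counts)
    # Cumulative share of the data at each bin's left edge.
    cum = [0.0]
    for c in counts:
        cum.append(cum[-1] + c / total)

    def scale(v):
        b = min(int((v - lo) / width), n_bins - 1)
        frac = (v - lo) / width - b  # position inside the bin, 0..1
        return cum[b] + frac * (cum[b + 1] - cum[b])

    return scale

# Three of four values fall in [0, 5), so that half gets 75% of the axis.
scale = density_scale([1, 2, 3, 8], n_bins=2, lo=0, hi=10)
```

In this simple form an empty bin collapses to zero width; a practical display would probably reserve a small minimum width per bin so that sparse regions remain visible.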
This paper discusses the visualization of the relationships in e-commerce transactions. To date, many practical research projects have shown the usefulness of a physics-based mass-spring technique to lay out data items with close relationships on a graph. We describe a market basket analysis visualization system using this technique. This system (1) integrates a physics-based engine into a visual data mining platform; (2) uses a 3D spherical surface to visualize the cluster of related data items; and (3) for large volumes of transactions, uses hidden structures to unclutter the display. Several examples of market basket analysis are also provided.
To date, many web visualization applications have shown the usefulness of a hyperbolic tree. However, we have discovered that strict hierarchical tree structures are too limited. For many practical applications, we need to generalize a hyperbolic tree to a hyperbolic space. This approach results in massive cross-links in a highly connected graph that clutter the display. To resolve this problem, an invisible link technique is introduced. In this paper, we describe in some detail the navigation of a large hyperbolic space using invisible links. We have applied this invisible link method to three data mining visualization applications: e-business web navigation for URL visits, customer call center for question-answer service, and web site index creation.