This PDF file contains the front matter associated with SPIE Proceedings Volume 7868, including the Title Page, Copyright information, Table of Contents, Introduction, and the Conference Committee listing.
Driven by market forces and spanning the full spectrum of computational devices, computer architectures are changing
in ways that present tremendous opportunities and challenges for data analysis and visual analytic technologies.
Leadership-class high-performance computing systems will have as many as a million cores by 2020 and support 10
billion-way concurrency, while laptop computers are expected to have as many as 1,000 cores by 2015. At the same
time, data of all types are increasing exponentially and automated analytic methods are essential for all disciplines.
Many existing analytic technologies do not scale to make full use of current platforms and fewer still are likely to scale
to the systems that will be operational by the end of this decade. Furthermore, on the new architectures and for data at
extreme scales, validating the accuracy and effectiveness of analytic methods, including visual analysis, will be a significant challenge.
This paper introduces a new method for creating an interactive sequence similarity map of all known influenza virus
protein sequences and integrating the map with existing general purpose analytical tools. The NCBI data model was
designed to provide a high degree of interconnectedness amongst data objects. Substantial and continuous increase in
data volume has led to a large and highly connected information space. Researchers seeking to explore this space are
challenged to identify a starting point. They often choose data that are popular in the literature. References in the literature
follow a power-law distribution, and popular data points may bias explorers toward paths that lead only to a dead end of
what is already known. To help discover the unexpected, we developed an interactive visual analytics system to map the
information space of influenza protein sequence data. The design is motivated by the needs of eScience researchers.
Proteins are biomolecules present in living organisms and essential for carrying out vital functions. Inherent to
their functioning is folding into different spatial conformations, and to understand these processes, it is crucial
to visually explore the structural changes. In recent years, significant advancements in experimental techniques
and novel algorithms for post-processing of protein data have routinely revealed static and dynamic structures of
increasing sizes. In turn, interactive visualization of the systems and their transitions became more challenging.
Therefore, much research has been done on the efficient display of protein dynamics, with the focus on space-filling
models; for the important class of abstract ribbon or cartoon representations, only a few efficient rendering methods
exist. Yet, these models are of high interest to scientists, as they provide a
compact and concise description of the structural elements along the protein main chain.
In this work, a method was developed to speed up ribbon and cartoon visualizations. Separating two phases
in the calculation of geometry allows computational work to be offloaded from the CPU to the GPU. The first phase
consists of computing a smooth curve along the protein's main chain on the CPU. In the second phase, conducted
independently by the GPU, vertices along that curve are moved to set up the final geometrical representation of the protein.
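The abstract does not specify which interpolation scheme produces the smooth curve; as an illustrative sketch of the CPU phase only, assuming Catmull-Rom interpolation through consecutive main-chain atom positions (function and parameter names are hypothetical):

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Evaluate one Catmull-Rom segment between p1 and p2 at t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2 * b + (-a + c) * t
               + (2 * a - 5 * b + 4 * c - d) * t2
               + (-a + 3 * b - 3 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )

def smooth_chain(points, samples_per_segment=8):
    """Sample a smooth curve through consecutive main-chain positions."""
    curve = []
    for i in range(1, len(points) - 2):
        for s in range(samples_per_segment):
            curve.append(catmull_rom(points[i - 1], points[i],
                                     points[i + 1], points[i + 2],
                                     s / samples_per_segment))
    curve.append(points[-2])  # close the last sampled segment
    return curve
```

In the two-phase scheme described above, a curve like this would be computed once on the CPU, while the per-vertex displacement into the final ribbon geometry would run on the GPU.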
We document an open-source toolbox for drawing large-scale undirected graphs. This toolbox is based on a previously
implemented closed-source algorithm known as VxOrd. Our toolbox, which we call OpenOrd, extends the capabilities of
VxOrd to large graph layout by incorporating edge-cutting, a multi-level approach, average-link clustering, and a parallel
implementation. At each level, vertices are grouped using force-directed layout and average-link clustering. The clustered
vertices are then re-drawn and the process is repeated. When a suitable drawing of the coarsened graph is obtained, the
algorithm is reversed to obtain a drawing of the original graph. This approach results in layouts of large graphs which
incorporate both local and global structure. A detailed description of the algorithm is provided in this paper. Examples
using datasets with over 600K nodes are given. Code is available at www.cs.sandia.gov/~smartin.
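OpenOrd applies force-directed layout at each level of its multi-level scheme. Its actual algorithm (VxOrd with edge-cutting and annealing-style phases) is more involved, but the core force model can be sketched as one generic spring-electrical iteration; this is an illustration of force-directed layout in general, not OpenOrd's implementation:

```python
import math

def force_directed_step(pos, edges, k=1.0, step=0.05):
    """One Fruchterman-Reingold-style iteration: all node pairs repel
    with force k^2/d; connected pairs attract with force d^2/k."""
    disp = {v: [0.0, 0.0] for v in pos}
    nodes = list(pos)
    for i, u in enumerate(nodes):                # pairwise repulsion
        for v in nodes[i + 1:]:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            d = math.hypot(dx, dy) or 1e-9
            fx, fy = (k * k / d) * dx / d, (k * k / d) * dy / d
            disp[u][0] += fx
            disp[u][1] += fy
            disp[v][0] -= fx
            disp[v][1] -= fy
    for u, v in edges:                           # attraction along edges
        dx = pos[u][0] - pos[v][0]
        dy = pos[u][1] - pos[v][1]
        d = math.hypot(dx, dy) or 1e-9
        fx, fy = (d * d / k) * dx / d, (d * d / k) * dy / d
        disp[u][0] -= fx
        disp[u][1] -= fy
        disp[v][0] += fx
        disp[v][1] += fy
    return {v: (pos[v][0] + step * disp[v][0],
                pos[v][1] + step * disp[v][1]) for v in pos}
```

Iterating such a step (with a cooling schedule on `step`) pulls connected vertices together and pushes unrelated ones apart, which is the behavior the multi-level coarsening above exploits at each level.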
To make progress in understanding knot theory, we will need to interact with the projected representations of mathematical
knots which are of course continuous in 3D but significantly interrupted in the projective images. One way to achieve
such a goal would be to design an interactive system that allows us to sketch 2D knot diagrams by taking advantage of a
collision-sensing controller and explore their underlying smooth structures through continuous motion. Recent advances
in interaction techniques make progress in this direction possible. Pseudo-haptics, which simulates
haptic effects using purely visual feedback, can be used to develop such an interactive system. This paper outlines one such
pseudo-haptic knot diagram interface. Our interface derives from the familiar pencil-and-paper process of drawing 2D knot
diagrams and provides haptic-like sensations to facilitate the creation and exploration of knot diagrams. A centerpiece of
the interaction model simulates a "physically" reactive mouse cursor, which is exploited to resolve the apparent conflict
between the continuous structure of the actual smooth knot and the visual discontinuities in the knot diagram representation.
Another value in exploiting pseudo-haptics is that acceleration (or deceleration) of the mouse cursor (or surface
locator) can be used to indicate the slope of the curve (or surface) whose projective image is being explored. By
exploiting these additional visual cues, we proceed to a full-featured extension to a pseudo-haptic 4D visualization system
that simulates the continuous navigation on 4D objects and allows us to sense the bumps and holes in the fourth dimension.
Preliminary tests of the software show that the main features of the interface overcome some expected perceptual limitations
in our interaction with 2D knot diagrams of 3D knots and 3D projective images of 4D mathematical objects.
The reconstruction of a continuous function from discrete data is a basic task in many applications such as the visualization
of 3D volumetric data sets. We use a local approximation method for quadratic C1 splines on uniform tetrahedral
partitions to achieve a globally smooth function. The spline is based on a truncated octahedral partition of the volumetric
domain, where each truncated octahedron is further split into a fixed number of disjoint tetrahedra. The Bernstein-Bézier
coefficients of the piecewise polynomials are directly determined by appropriate combinations of the data values in a local
neighborhood. As previously shown, the splines provide an approximation order two for smooth functions as well as
their derivatives. We present the first visualizations using these splines and show that they are well-suited for GPU-based,
interactive high-quality visualization of isosurfaces from discrete data.
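A quadratic Bernstein-Bézier polynomial on a tetrahedron has ten coefficients, indexed by multi-indices summing to two. A minimal sketch of its direct evaluation in barycentric coordinates, for illustration only (this is not the paper's GPU evaluation code):

```python
from math import factorial
from itertools import product

def quadratic_bb(coeffs, bary):
    """Evaluate a quadratic Bernstein-Bezier polynomial on a tetrahedron.
    coeffs maps multi-indices (i, j, k, l) with i+j+k+l == 2 to the
    Bernstein-Bezier coefficients; bary are barycentric coordinates
    (b0, b1, b2, b3) summing to 1."""
    val = 0.0
    for (i, j, k, l), c in coeffs.items():
        # multinomial factor 2! / (i! j! k! l!)
        mult = 2 / (factorial(i) * factorial(j) * factorial(k) * factorial(l))
        val += c * mult * bary[0]**i * bary[1]**j * bary[2]**k * bary[3]**l
    return val

# the ten quadratic multi-indices on a tetrahedron
INDICES = [a for a in product(range(3), repeat=4) if sum(a) == 2]
```

Setting all ten coefficients to the same value reproduces that constant everywhere (partition of unity), which is a quick sanity check for any Bernstein-Bézier evaluator.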
Interaction in visualization is often complicated and tedious. Brushing data in a visualization such as parallel
coordinates is a central part of the data analysis process, and sets visualization apart from static charts. Modifying
a brush, or combining it with another one, usually requires considerable effort and frequent mode switches,
slowing down interaction and even discouraging more complex questions.
We propose the use of multi-touch interaction to provide fast and convenient interaction with parallel coordinates.
By using a multi-touch trackpad rather than the screen directly, the user's hands do not obscure the
visualization during interaction. Using one, two, three, or four fingers, the user can easily and quickly perform
complex selections. Being able to change the selections rapidly, the user can explore the data set more easily
and effectively, and can focus on the data rather than the interaction.
Mixture models consist of a combination of independent functions that together describe
the distribution of points within a set. We present a framework for automatically discovering and evaluating
candidate models within unstructured data. Our abstraction of models enables us to seamlessly consider different
types of functions as equally possible candidates. Our framework does not require an estimate of the number of
underlying models, allows points to be probabilistically classified into multiple models or identified as outliers,
and includes a few parameters that an analyst (not typically an expert in statistical methods) may use to adjust
the output of the algorithm. We give results from our framework with synthetic data and classic data.
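The framework described above generalizes beyond any single function family and does not require the number of components in advance. As background, the classical special case it builds on, expectation-maximization for a two-component 1D Gaussian mixture, can be sketched as follows (a textbook illustration, not the paper's algorithm):

```python
import math

def em_gmm_1d(data, n_iter=50):
    """EM for a two-component 1D Gaussian mixture with equal-weight init."""
    s = sorted(data)
    mu = [s[len(s) // 4], s[3 * len(s) // 4]]   # crude quartile init
    var = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point
        resp = []
        for x in data:
            p = [w[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(x - mu[k]) ** 2 / (2 * var[k]))
                 for k in range(2)]
            tot = sum(p)
            resp.append([pk / tot for pk in p])
        # M-step: re-estimate weights, means, and variances
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(data)
            mu[k] = sum(r[k] * x for r, x in zip(resp, data)) / nk
            var[k] = max(sum(r[k] * (x - mu[k]) ** 2
                             for r, x in zip(resp, data)) / nk, 1e-6)
    return w, mu, var
```

Points with near-equal responsibilities under several components correspond to the probabilistic multi-model classification the abstract mentions; points with low density under every component are outlier candidates.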
Height fields are an important modeling and visualization tool in many applications, and their exploration requires
display at interactive frame rates. This is hard to achieve even with high-performance graphics computers due to the
inherent geometric complexity of height fields. Typical solutions use polygonal approximations of the height field to reduce
the number of geometric primitives that need to be rendered. Starting from a rough approximation, a refinement process
is applied until a desired level of detail is reached. In this work, we present a novel efficient algorithm that starts with
an approximation that carries enough information about the height field so that only few refinement steps are needed to
achieve any desired level of detail. Our initial approximation is a simple triangulation whose nodes are the critical points
of the height field, that is, the peaks, pits, and passes of the surface, which give its overall shape. The extraction of critical
points of the surface, which is a discrete structure, is done using a newly designed algorithm based on discrete Morse
theory and computational homology algorithms.
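The paper's extraction method rests on discrete Morse theory and computational homology; a much simpler illustration of the underlying idea is to classify each grid vertex as a peak, pit, or pass (saddle) from the sign pattern of its neighbor differences (a toy sketch, not the paper's algorithm):

```python
def classify_critical(f, x, y):
    """Classify interior grid vertex (x, y) of height field f (f[y][x])
    as 'peak', 'pit', 'saddle', or 'regular' from the signs of the
    differences to its 8 neighbours, visited in cyclic order."""
    ring = [(1, 0), (1, 1), (0, 1), (-1, 1),
            (-1, 0), (-1, -1), (0, -1), (1, -1)]
    diffs = [f[y + dy][x + dx] - f[y][x] for dx, dy in ring]
    if all(d < 0 for d in diffs):      # strictly higher than all neighbours
        return "peak"
    if all(d > 0 for d in diffs):      # strictly lower than all neighbours
        return "pit"
    # count sign changes of the differences around the ring
    changes = sum(1 for i in range(8)
                  if (diffs[i] > 0) != (diffs[(i + 1) % 8] > 0))
    return "saddle" if changes >= 4 else "regular"
```

A triangulation whose nodes are exactly these classified points captures the overall shape of the surface, which is what the initial approximation described above exploits.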
Datasets over a spatial domain are common in a number of fields, often with multiple layers (or variables)
within data that must be understood together via spatial locality. Thus one area of long-standing interest is
increasing the number of variables encoded by properties of the visualization. A number of properties have
been demonstrated and/or proven successful with specific tasks or data, but there has been relatively little work
comparing the utility of diverse techniques for multi-layer visualization. As part of our efforts to evaluate the
applicability of such visualizations, we implemented five techniques which represent a broad range of existing
research (Color Blending, Oriented Slivers, Data-Driven Spots, Brush Strokes, and Stick Figures). Then we
conducted a user study wherein subjects were presented with composites of three, four, and five layers (variables)
using one of these methods and asked to perform a task common to our intended end users (GIS analysts). We
found that the Oriented Slivers and Data-Driven Spots performed the best, with Stick Figures yielding the lowest
accuracy. Through analyzing our data, we hope to gain insight into which techniques merit further exploration
and offer promise for visualization of data sets with ever-increasing size.
Chromatography is a technique used to separate and quantify the components in a complex chemical mixture. We have
created a 3D visualization system capable of comparing the chemical properties of chromatographic systems. The
visualization system combines scatter plots, parallel coordinates, and specialized glyphs to assist in the analysis of
chromatographic data and comparisons of multiple systems. Using this tool, numerous separation systems can be readily
compared simultaneously, greatly facilitating the selection of systems that are likely to produce the desired separations
during method development.
Interactive visualization of very large data sets remains a challenging problem for the visualization community.
One promising solution involves using adaptive resolution representations of the data. In this model, important
regions of data are identified using reconstructive error analysis and are shown in higher detail. During the
visualization, regions with higher error are rendered with high resolution data, while areas of low error are
rendered at a lower resolution. We have developed a new dynamic adaptive resolution rendering algorithm along
with software support libraries. These libraries are designed to extend the VisIt visualization environment by
adding support for adaptive resolution data. VisIt supports domain decomposition of data, which we use to
define our AR representation. We show that with this model, we achieve performance gains while maintaining
error tolerances specified by the scientist.
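The per-region selection logic can be illustrated with a toy sketch: given a block of data and a set of candidate downsampling levels, pick the coarsest level whose reconstruction error stays within the scientist's tolerance. This is a simplified 1D stand-in for the domain-decomposed model, not VisIt's actual API:

```python
def choose_resolution(block, levels, tol):
    """Return the coarsest downsampling step whose max reconstruction
    error (piecewise-constant reconstruction) stays within tol."""
    for step in sorted(levels, reverse=True):          # try coarsest first
        recon = [block[(i // step) * step] for i in range(len(block))]
        err = max(abs(a - b) for a, b in zip(block, recon))
        if err <= tol:
            return step
    return 1                                            # full resolution
```

Tightening `tol` forces more blocks to full resolution; loosening it trades accuracy for rendering speed, which is the performance/error trade-off described above.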
Visual data presentations require adaptation for appropriate display on a viewing device that is limited in
resources such as computing power, screen real estate, and/or bandwidth. Due to the complexity of suitable
adaptation, the few proposed solutions available are either too resource-intensive or inflexible to be applied
broadly. Effective use and acceptance of data visualization on constrained viewing devices require adaptation
approaches that are tailored to the requirements of the user and the capabilities of the viewing device.
We propose a predictive device adaptation approach that takes advantage of progressive data refinement. The
approach relies on hierarchical data structures that are created once and used multiple times. By incrementally
reconstructing the visual presentation on the client with increasing levels of detail and resource utilization, we can
determine when to truncate the refinement of detail so as to use the resources of the device to their full capacities.
To determine when to finish the refinement for a particular device, we introduce a profile-based strategy which
also considers user preferences. We discuss the whole adaptation process, from the storage of the data in
a scalable structure to the presentation on the respective viewing device. This particular implementation is
shown for two common data visualization methods, and empirical results obtained from our experiments are
presented and discussed.
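The truncation rule can be sketched as a simple budget check: accumulate the estimated per-level resource costs (as a device profile might supply them) and stop refining at the first level that would exceed the device's capacity. This is a hypothetical minimal illustration, not the paper's profile-based strategy:

```python
def truncate_refinement(level_costs, budget):
    """Return how many refinement levels a device can afford: accumulate
    per-level resource costs until adding the next level would exceed
    the budget."""
    used = 0.0
    for n, cost in enumerate(level_costs):
        if used + cost > budget:
            return n          # stop before the level that overshoots
        used += cost
    return len(level_costs)   # the device can afford full detail
```

A real profile would fold in several resource dimensions (bandwidth, memory, screen size) and user preferences rather than a single scalar budget.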
In this work, we introduce EdgeMaps as a new method for integrating the visualization of explicit and implicit
data relations. Explicit relations are specific connections between entities already present in a given dataset, while
implicit relations are derived from multidimensional data based on shared properties and similarity measures.
Many datasets include both types of relations, which are often difficult to represent together in information
visualizations. Node-link diagrams typically focus on explicit data connections, while not incorporating implicit
similarities between entities. Multi-dimensional scaling considers similarities between items, however, explicit
links between nodes are not displayed. In contrast, EdgeMaps visualize both implicit and explicit relations by
combining and complementing spatialization and graph drawing techniques. As a case study for this approach
we chose a dataset of philosophers, their interests, influences, and birthdates. By introducing the limitation of
activating only one node at a time, interesting visual patterns emerge that resemble the aesthetics of fireworks
and waves. We argue that the interactive exploration of these patterns may allow the viewer to grasp the
structure of a graph better than complex node-link visualizations.
Visualizations can potentially misrepresent information if they ignore or hide the uncertainty that is usually
present in the data. While various techniques and tools exist for visualizing uncertainty in scientific visualizations,
there are very few tools that primarily focus on visualizing uncertainty in graphs or network data. With the
popularity of social networks and other data sets that are best represented by graphs, there is a pressing need
for visualization systems to show the uncertainty that is present in the data. This paper focuses on visualizing a
particular type of uncertainty in graphs: we assume that nodes in a graph can have one or more attributes, and
each of these attributes may have an uncertainty associated with it. Unlike previous efforts in visualizing node
or edge uncertainty in graphs by changing the appearance of the nodes or edges, e.g. by blurring, the approach
in this paper is to use the spatial layout of the graph to represent the uncertainty information. We describe a
prototype tool that incorporates several uncertainty-to-spatial-layout mappings and describe a scenario showing
how it might be used for a visual analysis task.
Moment tensors derived from seismic measurements during earthquakes are related to stress tensors and carry
important information about surface displacement in the earth's mantle. We present methods facilitating an
interactive visualization of scattered moment data to support earthquake and displacement analysis. For this
goal, we combine and link visualizations of spatial location and orientation information derived from moment
tensor decompositions. Furthermore, we contribute new tensor glyphs highlighting the indefinite character of
moment tensors as well as novel tensor clustering and averaging techniques to aid interactive visual analysis and
ease the challenges of interpreting moment tensor data.
The detection of previously unknown, frequently occurring patterns in time series, often called motifs, has been
recognized as an important task. However, it is difficult to discover and visualize these motifs as their numbers
increase, especially in large multivariate time series. To find frequent motifs, we use several temporal data mining
and event encoding techniques to cluster and convert a multivariate time series to a sequence of events. Then we
quantify the efficiency of the discovered motifs by linking them with a performance metric. To visualize frequent
patterns in a large time series with potentially hundreds of nested motifs on a single display, we introduce three
novel visual analytics methods: (1) motif layout, using colored rectangles for visualizing the occurrences and
hierarchical relationships of motifs in a multivariate time series, (2) motif distortion, for enlarging or shrinking
motifs as appropriate for easy analysis and (3) motif merging, to combine a number of identical adjacent motif
instances without cluttering the display. Analysts can interactively optimize the degree of distortion and merging to
get the best possible view. A specific motif (e.g., the most efficient or least efficient motif) can be quickly detected
from a large time series for further investigation. We have applied these methods to two real-world data sets: data
center cooling and oil well production. The results provide important new insights into the recurring patterns.
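The event-encoding step described above can be illustrated with a minimal sketch: discretize a series into symbols and count repeated symbol windows as candidate motifs. This is a toy stand-in for the paper's temporal data mining pipeline; the function and its parameters are hypothetical:

```python
def frequent_motifs(series, bins, width, min_count=2):
    """Discretize a series into equal-width value bins, then count every
    length-`width` symbol window; return motifs seen >= min_count times."""
    lo, hi = min(series), max(series)
    span = (hi - lo) or 1.0
    symbols = [min(int((v - lo) / span * bins), bins - 1) for v in series]
    counts = {}
    for i in range(len(symbols) - width + 1):   # sliding window over symbols
        key = tuple(symbols[i:i + width])
        counts[key] = counts.get(key, 0) + 1
    return {m: c for m, c in counts.items() if c >= min_count}
```

Real motif discovery must additionally handle multivariate series, nested and overlapping motifs, and approximate matches, which is what the layout, distortion, and merging views above are designed to make navigable.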
Publisher’s Note: This paper, originally published on 24 January 2011, was replaced with a corrected/revised version on 9 April 2015. If you downloaded the original PDF but are unable to access the revision, please contact SPIE Digital Library Customer Service for assistance.
Business processes have tremendously changed the way large companies conduct their business: The integration
of information systems into the workflows of their employees ensures a high service level and thus high customer
satisfaction. One core aspect of business process engineering is the events that steer workflows and trigger
internal processes. Strict requirements on interval-scaled temporal patterns, which are common in time series,
are thereby relaxed by the ordinal character of such events. It is this additional degree of freedom that
opens unexplored possibilities for visualizing event data.
In this paper, we present a flexible and novel system to find significant events, event clusters and event patterns.
Each event is represented as a small rectangle, which is colored according to categorical, ordinal or interval-scaled
metadata. Depending on the analysis task, different layout functions are used to highlight either the
ordinal character of the data or temporal correlations. The system has built-in features for ordering customers
or event groups according to the similarity of their event sequences, temporal gap alignment and stacking of
co-occurring events. Two characteristically different case studies dealing with business process events and news
articles demonstrate the capabilities of our system to explore event data.
Transfer functions play a crucial role in the understanding and visualization of 3D data. While research has
scrutinized the possible uses of one- and multi-dimensional transfer functions in the spatial domain, to our
knowledge, no attempt has been made to explore transfer functions in the frequency domain. In this work we
propose transfer functions for the purpose of frequency analysis and visualization of 3D data. Frequency-based
transfer functions offer the possibility to discriminate signals, composed from different frequencies, to analyze
problems related to signal processing, and to help understanding the link between the modulation of specific
frequencies and their impact on the spatial domain. We demonstrate the strength of frequency-based transfer
functions by applying them to medical CT, ultrasound and MRI data, physics data as well as synthetic seismic
data. The interactive design of complex filters for feature enhancement can be a useful addition to conventional spatial transfer functions.
The surveillance of large sea, air or land areas normally involves the analysis of large volumes of heterogeneous
data from multiple sources. Timely detection and identification of anomalous behavior or any threat activity
is an important objective for enabling homeland security. While many existing mining applications support the
identification of anomalous behavior, autonomous anomaly detection systems for area surveillance are rarely used
in the real world. We argue that such capabilities and applications present two
critical challenges: (1) they need to provide adequate user support and (2) they need to involve the user in the
underlying detection process.
In order to encourage the use of anomaly detection capabilities in surveillance systems, this paper analyzes
the challenges that existing anomaly detection and behavioral analysis approaches present regarding their use
and maintenance by users. We analyze input parameters, detection process, model representation and outcomes.
We discuss the role of visualization and interaction in the anomaly detection process. Practical examples from
our current research within the maritime domain illustrate key aspects presented.
Cluster analysis is an important data mining technique for analyzing large amounts of data, reducing many
objects to a limited number of clusters. Cluster visualization techniques aim at supporting the user in better
understanding the characteristics and relationships among the found clusters. While promising approaches
to visual cluster analysis already exist, these usually fall short of incorporating the quality of the obtained
clustering results. However, due to the nature of the clustering process, quality plays an important role, as
for most practical data sets many different clusterings are typically possible. Being aware of clustering quality
is important to judge the expressiveness of a given cluster visualization, or to adjust the clustering process with
refined parameters, among others.
In this work, we present an encompassing suite of visual tools for quality assessment of an important visual
clustering algorithm, namely the Self-Organizing Map (SOM) technique. We define, measure, and visualize the
notion of SOM cluster quality along a hierarchy of cluster abstractions. The quality abstractions range from
simple scalar-valued quality scores up to the structural comparison of a given SOM clustering with the output of
additional supportive clustering methods. The suite of methods allows the user to assess the SOM quality on the
appropriate abstraction level, and arrive at improved clustering results. We implement our tools in an integrated
system, apply it on experimental data sets, and show its applicability.
The proliferation of data in the past decade has created demand for innovative tools in different areas of exploratory
data analysis, like data mining and information visualization. However, the problem with real-world
datasets is that many of their attributes can identify individuals, or the data are proprietary and valuable. The
field of data mining has developed a variety of ways for dealing with such data, and has established an entire
subfield for privacy-preserving data mining. Visualization, on the other hand, has seen little, if any, work on
handling sensitive data. With the growing applicability of data visualization in real-world scenarios, the handling
of sensitive data has become a non-trivial issue we need to address in developing visualization tools.
With this goal in mind, in this paper, we analyze the issue of privacy from a visualization perspective and
propose a privacy-preserving visualization technique based on clustering in parallel coordinates. We also outline
the key differences in approach from the privacy-preserving data mining field and compare the advantages and
drawbacks of our approach.
This research discusses a novel application of ternary plots to the visualization of network traffic data. These plots prove
highly effective at identifying anomalous network activity and allow network monitoring to be performed far more
efficiently than with existing techniques. The visualization was implemented in our existing
visualization infrastructure to reduce development time. Testing was performed on actual network traffic data collected
from a local network. Multiple anomalies were easily identifiable within the data set without any prior knowledge as to
the contents of the test file. This paper discusses the ternary plot and its application to network traffic data, the formulas
needed to calculate and display ternary coordinates, and the basic architecture for the visualization implementation.
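The coordinate formulas the abstract refers to are standard: normalize the three traffic components so they sum to one, then map the composition into an equilateral triangle. A minimal sketch (vertex placement is one common convention; the paper may use another):

```python
import math

def ternary_to_cartesian(a, b, c):
    """Map a composition (a, b, c) to a point in an equilateral triangle
    with vertices A=(0, 0), B=(1, 0), C=(0.5, sqrt(3)/2)."""
    total = a + b + c
    a, b, c = a / total, b / total, c / total   # normalize to a+b+c == 1
    x = b + 0.5 * c
    y = (math.sqrt(3) / 2) * c
    return x, y
```

Because the point depends only on the ratios of the three components, traffic mixes with the same proportions plot to the same location, which is what makes deviations from a site's usual mix stand out as anomalies.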
Although the discovery and analysis of communication patterns in large and complex email datasets are difficult tasks,
they can be a valuable source of information. We present EmailTime, a visual analysis tool of email correspondence
patterns over the course of time that interactively portrays personal and interpersonal networks using the correspondence
in the email dataset. Our approach is to put time as a primary variable of interest, and plot emails along a time line.
EmailTime helps email dataset explorers interpret archived messages by providing zooming, panning, filtering, and
highlighting. To support analysis, it also measures and visualizes histograms, graph centrality, and frequency on the
communication graph that can be induced from the email collection. This paper describes EmailTime's capabilities,
along with a large case study with the Enron email dataset that explores the behaviors of email users in different
organizational positions from January 2000 to December 2001. We defined email behavior as the email activity level of
people with regard to a series of measured metrics, e.g., sent and received emails, number of email addresses, etc. These
metrics were calculated through EmailTime. Results showed specific patterns in the use of email within different
organizational positions. We suggest that integrating both statistics and visualizations to display information
about email datasets may simplify their evaluation.
We introduce a framework and class library (GAV Flash) implemented in Adobe's ActionScript, designed with the
intention to significantly shorten the time and effort needed to develop customized web-enabled applications for visual
analytics or geovisual analytics tasks. Through an atomic layered component architecture, GAV Flash provides a
collection of common geo- and information visualization representations extended with motion behavior including
scatter matrix, extended parallel coordinates, table lens, choropleth map and treemap, integrated in a multiple, time-linked
layout. Versatile interaction methods are drawn from many data visualization research areas and optimized for
dynamic web visualization of spatio-temporal and multivariate data. Based on layered component thinking and the use of
a programming-interface mechanism, the GAV Flash architecture is open and facilitates the creation of new or improved
versions of existing components, so that ideas can be tried out or optimized rapidly in a fully functional environment.
Following the Visual Analytics mantra, a "snapshot" mechanism for saving the explorative results of a reasoning process
is developed that aids collaboration and publication of gained insight and knowledge, embedded as dynamic
visualizations in blogs or web pages with associative metadata or "storytelling".
This paper presents a 3D visualization technique proposed to analyze and manage energy efficiency in a data
center. Data are extracted from sensors located in the IBM Green Data Center in Montpellier, France. These
sensors measure different quantities such as hygrometry, pressure, and temperature. We want to visualize in
real time the large amount of data produced by these sensors. A visualization engine has been designed, based
on a particle system and a client-server paradigm. In order to solve performance problems, a level-of-detail
solution has been developed, based on the earlier work introduced by J. Clark in 1976. In this paper we
introduce the particle method used for this work and subsequently explain the different simplification methods
applied to improve our solution.
In this paper, we use concepts from digital topology for the topological filtering of reconstructed surfaces. Given a finite set
S of sample points in 3D space, we use the Voronoi-based algorithm of Amenta & Bern to reconstruct a piecewise-linear
approximation of the surface in the form of a triangular mesh with vertex set equal to S. A typical surface obtained by means of
this algorithm often contains small holes that can be considered noise. We propose a method to remove the unwanted
holes that works as follows. We first embed the triangulated surface in a volumetric representation. Then, we use the 3D
hole-closing algorithm of Aktouf et al. to filter the holes by their size and close the small holes, which are in general irrelevant
to the surface, while the larger holes often represent topological features of the surface. We present experimental results
showing that this method allows unwanted holes in a 3D surface to be found and suppressed automatically and effectively.
We describe a meta-notation devised to express the major structural characteristics in widely-used data visualizations. The
meta-notation consists of unary and binary operators that can be combined to represent a visualization. Capturing structural
features of a visualization, our meta-notation can be applied to match or compare two visualizations at a conceptual level.
For example, a user's request for a visualization can be compared with visualization tools' presentation capabilities. The
design of the operators is discussed as we present their underlying concepts and show examples of their use. To illustrate
how expressive the meta-notation is, we explore some commonly-used data visualizations. A benefit of our approach is that
the operators define a set of required capabilities on which a visualization system can be organized. Thus, the meta-notation
can be used to design a system that interconnects various data visualization tools by sending and receiving visualization
requests between them.
The timeline is one of the most intuitive and commonly used methods for visualizing time-series data, and it
underlies widely used applications such as stock market data visualization and the tracking of election candidates'
poll numbers over time. While useful, these timeline visualizations lack contextual information about events
that relate to or cause changes in the data. We have developed a system that enhances timeline visualization
with display of relevant news events and their corresponding images, so that users can not only see the changes
in the data, but also understand the reasons behind the changes. We have also conducted a user study to test
the effectiveness of our ideas.
Visualization of multivariate data presents a challenge due to the sheer dimensionality and density of information. When
presenting the data symbolically, this high information dimensionality and density makes it difficult to develop a symbology
capable of displaying it in a single presentation. One approach to multivariate visualization involves creating symbols
with higher dimensionality. Higher dimensional symbols can be problematic, since they typically require significant human
attentive processing to interpret, offsetting their greater informational capacity. Although attempts have been made to develop
higher-dimensional symbols that are processed in a preattentive fashion, success has proven elusive. Recent cognitive
research indicates that outdoor scenes are processed in a preattentive manner. We evaluate outdoor scenes as a candidate
for developing an effective higher-dimensional symbology by generating proof-of-concept images and comparing them to