Analysts use visual analytical systems for exploring and analyzing structured datasets, and increasingly require tools to access supporting documents, research papers, and news reports. Visual analytical systems for text corpora typically concentrate on techniques for exploring only document collections. We have developed a system for visualizing and analyzing both document collections and structured datasets. We describe InfoMaps, a document visualization tool we developed within Weave, an open source framework for data exploration and analysis. Users of Weave analyzing datasets can search for documents from the web and networked repositories, and can use the matched documents as part of their analysis. Conversely, documents in InfoMaps can be used to identify relevant data subsets. In this paper, we discuss InfoMaps, its use and integration with the other visual tools of Weave, and our approach to the information extraction and integration process.
Iterative clustering (e.g., K-Means, EM) is one of the most commonly used clustering methods; it attempts to iteratively find a local optimum starting from an initial condition, including the initial centroids and the initial number of clusters. For iterative clustering, research has shown that the initial conditions are crucial to the quality and running time of a clustering computation. Using a novel visualization tool, CComViz (Cluster Comparison Visualization), we present an innovative approach to refining the initial centroids and the number of clusters by visually analyzing multiple clustering results generated by different clustering algorithms. As an example, we apply our new approach to a gene expression case study to generate a better, convergent clustering. The proposed approach can be considered an extension of cluster ensembles, since the original data sources are reused, while in classic cluster ensembles they are not.
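The sensitivity to initial conditions described above can be illustrated with a minimal K-Means sketch. This is plain NumPy, not the paper's CComViz tool, and the "refinement" here is simply keeping the lowest-inertia run among several random initializations, standing in for the visually guided refinement the abstract describes:

```python
import numpy as np

def kmeans(X, centroids, n_iter=50):
    """Plain Lloyd's iterations starting from the given initial centroids."""
    for _ in range(n_iter):
        # assign each point to its nearest centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute centroids, keeping the old one if a cluster empties
        centroids = np.array([X[labels == k].mean(axis=0) if np.any(labels == k)
                              else centroids[k] for k in range(len(centroids))])
    inertia = float(((X - centroids[labels]) ** 2).sum())
    return labels, centroids, inertia

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (50, 2)),   # two well-separated blobs
               rng.normal(3, 0.3, (50, 2))])

# different random initializations can land in different local optima;
# we keep the best of several runs
runs = []
for seed in range(5):
    r = np.random.default_rng(seed)
    init = X[r.choice(len(X), size=2, replace=False)]
    runs.append(kmeans(X, init))
labels, centroids, inertia = min(runs, key=lambda t: t[2])
```

A visual tool like CComViz would instead let the analyst compare the competing partitions side by side before committing to one.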
Tightly coupled visualization and analysis is a powerful approach to data exploration, especially for clustering. We describe one such integration of analysis and visualization for the evaluation of multiple partitions of a dataset. Partitions are decompositions of a dataset into a family of disjoint subsets. They may be the results of clustering, of groupings of categorical dimensions, of binned numerical dimensions, of predetermined class-labeling dimensions, or of prior knowledge structured in a mutually exclusive format (one data item associated with one and only one outcome).
Partition or cluster stability analysis can be used to identify near-optimal structures, build ensembles, or conduct validation. We extend Parallel Sets into a new visualization tool that provides for the mutual comparison and evaluation of multiple partitions of the same dataset. We describe a novel layout algorithm for informatively rearranging the order of records and dimensions, and we provide examples of its application to data stability and correlation at the record, cluster, and dimension levels within a single interactive display.
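As an illustration of comparing multiple partitions of the same dataset, a simple pairwise agreement score (the Rand index) can be computed for each pair of partitions; the label vectors below are hypothetical stand-ins for clustering, EM, and binning outputs, not data from the paper:

```python
from itertools import combinations

def rand_index(p, q):
    """Fraction of item pairs on which two partitions agree
    (pairs that are together in both, or apart in both)."""
    n = len(p)
    agree = sum((p[i] == p[j]) == (q[i] == q[j])
                for i, j in combinations(range(n), 2))
    return agree / (n * (n - 1) / 2)

# three hypothetical partitions (flat label vectors) of the same 8 records
partitions = {
    "kmeans": [0, 0, 0, 1, 1, 1, 2, 2],
    "em":     [1, 1, 1, 0, 0, 0, 2, 2],   # same grouping, relabeled
    "binned": [0, 0, 1, 1, 2, 2, 2, 2],
}
scores = {(a, b): rand_index(partitions[a], partitions[b])
          for a, b in combinations(partitions, 2)}
```

Note that the index is label-invariant: "kmeans" and "em" group the records identically despite different label names, so their score is 1.0. A Parallel Sets style display conveys the same pairwise structure visually, at record rather than summary granularity.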
Although there are a number of visualization systems to choose from when analyzing data, only a few of these allow for the integration of other visualization and analysis techniques. There are even fewer visualization toolkits and frameworks from which one can develop one's own visualization applications. Even within the research community, scientists either use what they can from the available tools or start from scratch on a program in which they can develop new or modified visualization techniques and analysis algorithms. Presented here is a new general-purpose platform for constructing numerous visualization and analysis applications. The focus of this system is the design of and experimentation with new techniques, where sharing and integration with other tools become second nature. Moreover, this platform supports multiple large data sets and the recording and visualizing of user sessions. Here we introduce the Universal Visualization Platform (UVP) as a modern data visualization and analysis system.
Several authors have developed automated parameterized visualization generation systems [14,15,16]. All generate classic visualizations or combinations of such visualizations. A vector space model of visualization was proposed by Hoffman [18], leading to the development of new visualizations and the concept of interpolating visualizations. These new visualizations provide alternative representations and insights into data and have been applied successfully in numerous data analysis problems including gene expression, drug discovery, clinical trials, toxicogenomics, and medical informatics [23]. In this paper we elevate this vector space model to include analytic visualizations, ones with tightly coupled analysis, such as Self-Organizing Maps (SOMs) and Multi-Dimensional Scaling (MDS). We describe our new model and provide an example interpolation of a SOM and a scatterplot with a simple data set (the Fisher Iris data) and a more complex and larger one (microarray gene expression data).
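The idea of interpolating between a classic visualization and an analytic one can be sketched as a blend of two point layouts. This is a simplified stand-in: instead of a trained SOM or an MDS embedding, the "analytic" layout below is a projection onto the first principal direction, and the interpolation is plain linear blending of point positions:

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(30, 4))          # stand-in for Iris-like records

# layout A: a classic scatterplot of two chosen data dimensions
layout_a = data[:, :2]

# layout B: a stand-in analytic layout -- a projection onto the first
# principal direction (a real system would run a SOM or MDS here)
u, s, _ = np.linalg.svd(data - data.mean(axis=0), full_matrices=False)
layout_b = np.column_stack([u[:, 0] * s[0], np.zeros(len(data))])

def interpolate(a, b, t):
    """Blend two point layouts; t=0 gives the scatterplot, t=1 the projection."""
    return (1 - t) * a + t * b

midway = interpolate(layout_a, layout_b, 0.5)
```

Animating t from 0 to 1 lets a viewer track how individual records move between the two representations, which is the perceptual payoff of interpolating visualizations.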
Visualization has proved to be a suitable paradigm for the analysis and exploration of datasets. In the data mining cycle, visualization has been focused mainly on data visualization and output generation. However, besides datasets, many other entities need to be explored and understood by users and analysts. In this paper, we describe the role of visualization in the data mining process, and we present a model to support the interaction between users and data mining entities. We discuss visualizations of datasets, parameter spaces of data mining algorithms, models induced from datasets, and patterns generated by the application of data mining algorithms to datasets. We have developed a Java-based testbed that implements the extended data mining model with visual support to interact with datasets, models, parameter spaces, and patterns. Experimental results based on several public datasets, data mining algorithms, multidimensional visualization techniques, and other novel visualizations clearly show the benefits of integrating visualization into the data mining process.
This paper focuses on the visualization of application code. The goal is to provide a visual representation of the code that relieves the user from having to rely solely on textual representations. The visual representation of code can be correlated with visual displays of the data and displayed simultaneously within a single display. This prevents the user from having to change perceptual contexts or focus to a separate window. The visualization techniques are themselves based on the familiar metaphor of picture frames. Since we wish to provide the application data within the same display, representing operations and instructions as a frame around the application data provides a merged display with the data and operations correlated in an unobtrusive manner. The borders also work well because they match the human perceptual system's ability to differentiate and characterize edges. The borders can be as simple as rectangles, with the characteristics of the rectangle (i.e., thickness, color, consistency, etc.) identifying the type of operation. Portions of the frame can be made representative of the underlying data (i.e., current and termination conditions).
Traditionally, visualization systems have focused on the visual sense. However, with the advent of multimedia and virtual reality systems, other senses such as sound and touch are slowly being incorporated into systems. Even in the visual channel, the majority of systems depend on the perception of geometry through graphical concepts such as lines, fill areas, windows, and raster pixmaps to provide the visualization feedback. Sound is being used effectively in visualization systems and is increasingly being integrated into mainstream systems. However, we have not made much progress in developing a fundamental understanding of interaction in non-geometric representational spaces. We are interested in extending simple interactions, such as zoom and pan, into other domains such as sound. Pitch is a perceptual quantity of sound that is associated with the physical quantity frequency. We describe how zoom and pan operations in pitch are supported. Formal definitions for these operations are also provided. Finally, we describe a prototype system for such interactions.
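One plausible way to make zoom and pan in pitch concrete (these definitions are illustrative, not necessarily the paper's formal ones): since pitch perception tracks log-frequency, a pan becomes a multiplicative frequency shift, and a zoom rescales intervals around a center frequency in log-frequency space:

```python
def pan_pitch(freq_hz, semitones):
    """Pan in pitch space: shift every frequency by a fixed musical
    interval -- a multiplication in frequency, an addition in
    log-frequency (which is what pitch perception tracks)."""
    return freq_hz * 2 ** (semitones / 12)

def zoom_pitch(freq_hz, center_hz, factor):
    """Zoom in pitch space: expand (factor > 1) or compress (factor < 1)
    intervals around a center frequency by scaling distances in
    log-frequency."""
    return center_hz * (freq_hz / center_hz) ** factor

a4 = 440.0
octave_up = pan_pitch(a4, 12)            # 880.0 Hz, one octave higher
compressed = zoom_pitch(880.0, a4, 0.5)  # an octave compressed to a tritone
```

Under this formulation, zooming out compresses a wide data-to-pitch mapping into a comfortable hearing range, while panning sweeps attention across it, directly mirroring their visual counterparts.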
We present a conceptual model for interaction in exploratory data visualization systems. This model extends interaction to systems that use geometry, texture, color, and sound to present data to the user. Such systems need to be highly interactive and user-centered. We extend interaction to utilize not only geometry, but also color and sound representations of data. Our conceptual model supports interaction in multiple representational spaces. A representational space is a space, such as a color space or a sound space, that is used in the representation of data. The model extends the conventional visualization output pipeline, separating the data from its representations. Interaction operations can be performed on either part of the pipeline.
This paper describes an "interaction interface" for visual database exploration. Visualization has traditionally been thought of as an output technology; this research places visualization into a broader context and aims to develop an input model for the visual exploration of databases. We first describe the data infrastructure of an integrated database-visualization system. We then extend the definition of visualization to include the data interactions allowed over the visualized image. We finally present the portion of this system that describes how interactions over data visualizations are mapped to the targets of the visual interaction: the various data objects in the system, or the database itself. In this way, the user is brought closer to the data because interaction is over a visualization, which is perceived by the user, and the correct effect of the interaction is automatically mapped to the appropriate underlying data object. We build on a fundamental taxonomy of empirically developed data interaction, and use these interaction specifications in our object-oriented design.
Debugging concurrent systems has been shown to be much more complex than debugging serial systems; this is further complicated by the number of processors that may be involved in any operation. The correctness of such systems is as important as that of serial systems. This places a considerable amount of extra demand on the debugging environment: additional capability must be provided to offset the complexity. We describe work related to the visualization of data associated with concurrent systems to aid users in comprehending the operation and correctness of their concurrent applications.
We describe a software platform in which large DNA sequence datasets may be visualized by techniques that readily reveal patterns and insights. Initially we have focused on providing accurate statistical visualizations rather than qualitative presentations. The first application of this platform visualizes properties of DNA sequence strings of any size as a function of string position (for example, in a large chromosome). We provide an example in which we visualize the ratio of found to expected frequency of occurrence for specific sequence strings (AAAA and TTTT) and show that these reveal interesting patterns in that DNA string (yeast chromosome III). For flexibility, any new function, calculated from the sequence string, may be added to the software platform.
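The found-to-expected ratio described above can be sketched as a sliding-window computation. The null model here is an assumption on my part: expected counts come from a zero-order model using each window's own single-base composition; the paper's platform may use a different expectation:

```python
from collections import Counter

def obs_over_exp(seq, word, window=200, step=100):
    """Ratio of observed to expected counts of `word` in sliding windows.
    Expected count assumes independent bases drawn with the window's own
    single-base frequencies (a simple zero-order null model)."""
    k = len(word)
    ratios = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        base_freq = Counter(w)
        observed = sum(w[i:i + k] == word for i in range(len(w) - k + 1))
        expected = len(w) - k + 1
        for b in word:
            expected *= base_freq[b] / len(w)
        ratios.append((start, observed / expected if expected else 0.0))
    return ratios

# toy sequence with an AAAA-rich patch in the middle
seq = "ACGT" * 100 + "AAAA" * 25 + "ACGT" * 100
ratios = obs_over_exp(seq, "AAAA")
```

Plotting the ratio against window start position gives exactly the kind of position-indexed statistical track the platform displays for a chromosome.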
We discuss the integration of visualization and supercomputing in a low cost environment. Computational requirements continue to increase dramatically as computational capabilities do. Yet most architectures still separate the two processes: the computation is done on one system and the visualization on another. We describe an innovative architecture developed by the Supercomputer Research Center of the Institute for Defense Analysis within which the integration of visualization and supercomputation is realized. Immediate gains are obvious: program visualization, real-time computational steering, and rapid porting of current applications. We describe the issues we encountered in porting our experimental visualization software, and the limitations and advantages of the hardware/software coupling. We also discuss a proposed extension of that architecture.
One promise of telerobotics is the ability to interact in environments that are distant (e.g., deep sea or deep space), dangerous (e.g., nuclear, chemical, or biological environments), or inaccessible by humans for political or legal reasons. Key components of such interactions are sophisticated human-computer interfaces that can replicate sufficient information about a local environment to permit remote navigation and manipulation. This environment replication can, in part, be provided by technologies such as virtual reality. In addition, however, telerobotic interfaces may need to enhance human-machine interaction to assist users in task performance, for example, governing motion or manipulation controls to avoid obstacles or to restrict interaction with certain objects (e.g., avoiding contact with a live mine or a deep sea treasure). Thus, effective interactions within remote environments require intelligent virtual interfaces to telerobotic devices. In part to address this problem, MITRE is investigating virtual reality architectures that will enable enhanced interaction within virtual environments. Key components of intelligent virtual interfaces include spoken language processing, gesture recognition algorithms, and, more generally, task recognition. In addition, these interfaces will eventually have to take into account properties of the user, the task, and the discourse context to be more adaptive to the situation at hand. While our work has not yet investigated the connection of virtual interfaces to external robotic devices, we have begun developing the key components for intelligent virtual interfaces for information and training systems.
The authors introduce the Exvis exploratory data visualization system. This system uses a display technique based on visual texture perception to reveal structure in multidimensional data and includes a sound output facility for simultaneous sonification of data. The elementary unit of the display is a glyph, or 'icon,' whose attributes are data-driven. Global display controls for icon geometry, sound, and color have been added to the original system. A global control is a transformation that applies to the entire icon completely independently of the mapping of specific data parameters to specific icon attributes. These controls allow the user to maximize both the visual contrast and the auditory contrast available for a given choice of icon and a given mapping of data parameters to icon attributes, and they allow the user to selectively enhance different features in an iconographic display. Using these controls to manipulate displays of computer-generated multidimensional data, the authors have been able to obtain pictures that exhibit well-differentiated texture regions even though the data that produce these regions have no differences in their first-order statistics. The global display controls are most interactive when Exvis is implemented on a computing platform such as the Connection Machine, which can redraw an iconographic picture as rapidly as the user can manipulate the controls.
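The separation between data-driven icon attributes and global display controls can be sketched as follows. The attribute names and ranges below are illustrative inventions, not Exvis's actual icon definition; the point is that the global control transforms the whole icon regardless of which data parameter drives which attribute:

```python
from dataclasses import dataclass

@dataclass
class Icon:
    """A stick-figure style glyph; each attribute is driven by one
    data parameter (names and ranges here are illustrative)."""
    limb_angles: list   # degrees, one limb per data dimension
    color: float        # normalized 0..1 color value
    pitch: float        # sonification frequency, Hz

def make_icon(record, angle_range=(0.0, 45.0)):
    """Data-driven mapping: the first three fields drive limb angles,
    the fourth drives color, the fifth drives the sound channel."""
    lo, hi = angle_range
    return Icon(limb_angles=[lo + v * (hi - lo) for v in record[:3]],
                color=record[3],
                pitch=200.0 + 600.0 * record[4])

def global_angle_contrast(icon, gain):
    """A global control: rescale every limb angle by the same gain,
    independent of the data-parameter-to-attribute mapping."""
    icon.limb_angles = [a * gain for a in icon.limb_angles]
    return icon

icon = make_icon([0.2, 0.5, 0.9, 0.3, 0.7])
icon = global_angle_contrast(icon, 1.5)
```

Applied uniformly across thousands of icons, such a gain raises the visual contrast of the resulting texture field without disturbing which data parameters are encoded where.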
In this paper we discuss data exploration as a particularly difficult case within the general problem of data visualization. We describe (1) a novel graphic technique for displaying multidimensional data visually and (2) an auditory display integrated with the visual display that allows us to represent multidimensional data in sound. The visual/auditory display employs an "iconographic" technique that seeks to exploit the spontaneous perceptual capacity to sense and discriminate texture. Structures in data to be analyzed can appear, both visually and aurally, as distinct textural regions and contours when the data are represented iconographically. Sound can be used to reinforce the visual presentation or to augment the dimensionality of the visual display. The immediate focus of the work reported here is to investigate how best to transform data into perceptible visual and auditory textures, that is, how best to "perceptualize" the data. A key problem we discuss is deciding which fields of a multidimensional data set should be represented in the visual domain and which in the auditory domain. This activity is part of the University of Lowell's Exploratory Visualization (Exvis) project, a multidisciplinary effort to develop new paradigms for the exploration and analysis of data with high dimensionality.
28 January 2008 | San Jose, California, United States
Visual Data Exploration and Analysis IV
12 February 1997 | San Jose, CA, United States
Visual Data Exploration and Analysis III
31 January 1996 | San Jose, CA, United States
Visual Data Exploration and Analysis II
8 February 1995 | San Jose, CA, United States
SC1023: Modern Data Visualization
Visualization, whether scientific or information, is now used by almost every organization. Scientific visualization is a key component of medical, engineering, and physical data analysis, and information visualization breathes life into business, financial, and textual data. Whether in the research and development of electronic imaging devices or simply in getting directions from one address to another, we have become dependent on visualization. In this course, we will briefly cover the history of visualization and its relationship to computer graphics, imaging, and visual analytics; review the most common and successful visualizations; discuss advanced visualizations, many integrated with analysis (the new visual analytics); and explore on-going research, especially in the area of high dimensional data visualization. We will also provide an overview of various visualization systems currently available.