Electronic mail is a facility, analogous to postal mail, in which computers are used to compose, deliver, and receive
messages. Traditional electronic mail systems rely solely on text as the medium of communication. A multi-media
electronic mail application, Mail, combines the media of Rich Text, voice, images, and electronic documents to facilitate communication.
With Mail, these various media can be integrated into a single message. The variety of available media and the complexity
of the messages that result from their combination make it important for Mail to have a simple user interface. It was
possible to develop a simple, graphically-based interface that would accommodate Mail's message complexity.
The integration of these diverse media is made practical by the rich operating environment in which Mail runs. Modern
advances in hardware, operating systems, libraries, and servers make possible this powerful multi-media electronic mail application.
The convolution mask is a basic and useful tool for image processing, allowing edge detection, noise cleaning, texture extraction, and more. Nevertheless, such operators are time-consuming in the two following cases: when the mask is large (up to 64×64 for texture or correlation extraction) and when the convolution is iterated. A parallel architecture is efficient for limiting the processing time, but it requires an overlap (2n lines and 2n columns) between the fields assigned to the different processors. Such an overlap allows only a limited number of iterations (n in the case of a 3×3 mask). Thus, if more than n iterations are needed, the memory is refreshed so that only "true" radiometries are processed.
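The overlap arithmetic above can be illustrated with a minimal sketch (not the paper's parallel implementation): each 3×3 pass consumes one border pixel on every side, so a tile padded with n extra lines and columns per side supports exactly n iterations.

```python
import numpy as np

def convolve3x3_valid(img, mask):
    """One 3x3 convolution pass; the valid output shrinks by 1 pixel per side."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for dy in range(3):
        for dx in range(3):
            out += mask[dy, dx] * img[dy:dy + h - 2, dx:dx + w - 2]
    return out

# A processor's field of F x F "true" pixels, padded with n overlap pixels per
# side, supports exactly n iterations of a 3x3 mask before the border is used up.
n, F = 4, 16
tile = np.random.rand(F + 2 * n, F + 2 * n)
mask = np.full((3, 3), 1.0 / 9.0)           # simple smoothing mask (illustrative)
for _ in range(n):
    tile = convolve3x3_valid(tile, mask)
assert tile.shape == (F, F)                 # only "true" radiometries remain
```

Beyond n iterations, the result would start depending on pixels owned by neighboring processors, which is why the memory refresh is needed.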
There are a variety of emerging technologies which will facilitate the incorporation of video into the work station
environment. These technologies, whether analog or digital, very often require special purpose
hardware to be added to the work station or personal computer. We have been investigating algorithms for the
display and integration of video image sequences into a PC using an off-the-shelf graphics adapter. The techniques
we employ are not as powerful as those taking advantage of special purpose hardware. In particular, no provisions are
made for either real-time capture or full screen motion. However, by providing the simultaneous display of multiple,
scalable, moving movies, "multi-media" applications which use preprocessed, window-sized video sequences can be supported.
In this paper we describe both techniques for image reduction and coding and various strategies for decoding,
playing, and smoothly moving video image sequences.
Bit-mapped drawings are normally created by using the so-called "Paint Editor" through the provided
image primitives. A set of operation tools is also available for interactive manipulation of primitive
objects. Window-based operating environments, such as the IBM OS/2 Presentation Manager™, provide
an Application Programming Interface to send messages to modules that actually carry out bitmap generation.
We observed that this message-driven architecture can facilitate the "capture and reapplication of
user operations" for application programs. Therefore, it is possible to create a set of bit-mapped drawings
that conform to certain properties through the same sequence of operations.
Two applications are described in this paper. The first is the creation of fonts with a consistent typeface.
The sequence of operations used in designing the typeface is automatically recorded. The user can then
generate bitmaps of other characters with the same typeface by repeating the same sequence of operations.
The second application is the generation of shadows of 3D objects. If a profile of an object is given, the
shadow can be synthesized by bitmap manipulation primitives. When the same sequence of primitives is
applied to different objects, shadows can be generated correctly.
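The capture-and-reapplication idea can be sketched as follows; the Recorder and Bitmap classes and the set_pixel operation are hypothetical stand-ins for the message-driven Presentation Manager interface, not its actual API.

```python
class Recorder:
    """Captures primitive operations (messages) so they can be reapplied later."""
    def __init__(self):
        self.ops = []                        # recorded (name, args) messages

    def send(self, target, name, *args):
        self.ops.append((name, args))        # capture the message...
        getattr(target, name)(*args)         # ...and forward it to the module

    def replay(self, target):
        for name, args in self.ops:          # reapply the same sequence
            getattr(target, name)(*args)

class Bitmap:
    """Toy stand-in for a Paint-Editor bitmap."""
    def __init__(self, w, h):
        self.pixels = [[0] * w for _ in range(h)]

    def set_pixel(self, x, y):
        self.pixels[y][x] = 1

rec = Recorder()
a = Bitmap(4, 4)
rec.send(a, "set_pixel", 1, 1)               # user draws on bitmap a
rec.send(a, "set_pixel", 2, 2)

b = Bitmap(4, 4)                             # a second character / object
rec.replay(b)                                # same operations, same properties
assert a.pixels == b.pixels
```

Because every drawing action passes through a message interface, recording and replaying the message stream is enough to reproduce the operation sequence on a different target.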
An approach to integration of software and hardware for image production closely coupled with
real-time control of an instrument is described. Image production includes reconstruction, processing,
display, archiving and export. Feedback of information from images to instrument control
is an essential feature, often with a human operator in the loop.
Real-time control requires guaranteed response to external events, support for multiple processors
and a variety of interfaces to instrumentation. Image production involves implementation of
reconstruction and processing algorithms, a windows system, and network support. Both functions
need a consistent user interface and an adaptable programming environment. The UNIX
operating system running on a network of workstations meets the needs of image production
and is very well suited to user interface and software development for both imaging and control.
However, high performance real-time control is not possible under most available UNIX systems,
but can be accomplished using a real-time kernel running on separate hardware. This suggests
a multi-processor approach based on inter-process communication between the workstation and the real-time hardware.
The architecture consists of three hardware layers (UNIX workstations, board-level microcomputers,
and digital signal processors) and four software layers. The real-time software layers are DSP
microcode and the microprocessor code. The lower layer of workstation software is a multi-tasked
command-driven program tailored for instrument control and image production. The top layer
of workstation software consists of an icon or menu driven user interface. A research instrument
for nuclear magnetic resonance imaging will serve as an example of this approach.
The ideal image processing system would be one that people already know how
to use. Its components would be interchangeable with other computers in the
office and in the factory. It could be quickly programmed to do its job by nonprogrammers
and non-typists. And it would communicate and print reports
over the same network used by all the other computers in your company.
Does this sound like science fiction? It's not. The image processing system in
this scenario represents a trend toward standardization in the industry. Image
processing suppliers are realizing that they must move away from their
existing proprietary architectures and customized software systems in favor of
standard computers having a wide base of support hardware and software, and
a wider pool of knowledgeable users.
We consider what is needed to create electronic document libraries which
mimic physical collections of books, papers, and other media.
The quantitative measures of merit for personal workstations-cost, speed, size of
volatile and persistent storage-will improve by at least an order of magnitude in the next
decade. Every professional worker will be able to afford a very powerful machine, but
databases and libraries are not really economical and useful unless they are shared. We
therefore see a two-tier world emerging, in which custodians of information make it
available to network-attached workstations. A client-server model is the natural
description of this world.
In collaboration with several state governments, we have considered what would be
needed to replace paper-based record management for a dozen different applications.
We find that a professional worker can anticipate most data needs and that (s)he is
interested in each clump of data for a period of days to months. We further find that
only a small fraction of any collection will be used in any period. Given expected
bandwidths, data sizes, search times and costs, and other such parameters, an effective
strategy to support user interaction is to bring large clumps from their sources, to
transform them into convenient representations, and only then start whatever investigation
is intended. A system-managed hierarchy of caches and archives is indicated.
Each library is a combination of a catalog and a collection, and each stored item has a
primary instance which is the standard by which the correctness of any copy is judged.
Catalog records mostly refer to 1 to 3 stored items. Weighted by the number of bytes
to be stored, immutable data dominate collections. These characteristics affect how
consistency, currency, and access control of replicas distributed in the network should be managed.
We present the main features of a design for network document/image library services.
A prototype is being built for State of California pilot applications. The design allows
library servers in any environment with an ANSI SQL database; clients execute in any
environment; communications are with either TCP/IP or SNA LU 6.2.
An image based geographic information system (GIS) using an efficient combination of image (raster) data and
vector data has been developed. The system has been realized on a 32 bit engineering work station (TOSHIBA AS3000
series) with special hardware for high speed image data retrieval and display. The system adopts several (up to four)
5.25 inch optical disks to store the image data, because of compactness and removability. It uses image data for a base
map in combination with vector data and attribute data (text or numeral data), while the conventional GIS uses vector
data for the base map because of its lack of image processing capability.
Two new techniques employed in the system are described. One is a technique to obtain a seamless base map from
a conventional printed atlas. It is necessary to adjust the mismatching caused by the distortion between two neighboring
maps. The other is a technique for distributing image data to multiple optical disks and concurrently retrieving it from them.
It has been shown that the proposed image handling techniques reduce the initial base map input cost and make the
man-machine interface powerful, with such functions as smooth scrolling, movies, color photographs, and a variety of other display functions.
Document scanning is now an accepted part of office procedure, allowing the incorporation of digitized images into new documents and the conversion of scanned print into ASCII by optical character recognition (OCR). Often document pages contain more than one form of information - textual, graphical and/or pictorial. Segmentation of document images into these three categories is feasible with the aid of image processing. Projections of the thresholded document images in conjunction with autocorrelation are used to check text alignment. Then the edge shifting properties of the rank filter are used to coalesce image regions containing text into solid near-rectangular blocks. Pyramidal reduction is combined with the filtering to ease the computational burden. Horizontal and vertical projections are used to segment whole pages recursively into homogeneous blocks
whose properties are then analysed. Applications foreseen for the image segmentation include modified facsimile systems, achievement of
artifact-free OCR and conversion of document images into files with separate formats for text, graphics and pictures.
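One level of the recursive projection splitting can be sketched as follows; the binary page, the blank-gap rule, and the block representation are simplified stand-ins for the thresholded projections described above.

```python
import numpy as np

def split_by_projection(page, axis):
    """Split a binary page into homogeneous blocks along blank projection gaps.

    axis=0 splits into horizontal bands (row profile); axis=1 into columns.
    """
    profile = page.sum(axis=1 - axis)        # ink count per row or per column
    blocks, start = [], None
    for i, v in enumerate(profile):
        if v > 0 and start is None:
            start = i                        # block begins at first inked line
        elif v == 0 and start is not None:
            blocks.append((start, i))        # blank gap closes the block
            start = None
    if start is not None:
        blocks.append((start, len(profile)))
    return blocks

# Toy page: two "text" bands separated by white space.
page = np.zeros((10, 8), dtype=int)
page[1:3, 1:7] = 1
page[6:9, 2:6] = 1
print(split_by_projection(page, axis=0))     # → [(1, 3), (6, 9)]
```

Applying the same split alternately along rows and columns inside each block yields the recursive page decomposition; the rank filtering and pyramidal reduction of the paper would run before this step.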
A paradigm for the visualization of image interpretation systems is described. The
paradigm uses visual interactions with the human observer to extract visual knowledge
from the user and format it into rules, thresholds and choice of relevant attributes. The
detailed description of the algorithm is presented through the analysis of an example of
grouping line-segments into straight lines.
Two different levels of abstraction are involved in the proposed learning mechanism.
One involves the calculation of values and thresholds and is based on a statistical analysis
of the user's chosen examples. The other mechanism deals with reformatting and rewriting
of rules and is a symbolic process that belongs to the more abstract levels of interpretation.
The mechanism for reconsideration incorporates the two concept learning paradigms,
extension and correction. The correction of formerly learned rules is based on additional
examples chosen by the user and invokes a new, low-level calculation of thresholds only
as a last resort.
This paper introduces a general-purpose method for black and white image segmentation, based on the design of a rule-based system. The rules integrate general knowledge that is completely independent of environment and scene content; they process regions and lines simultaneously by merging similar regions, splitting non-uniform ones, and connecting lines or deleting non-significant ones. Rules are selected according to the nature of the data under analysis, evaluated by means of a set of performance measurements, and ordered according to their efficiency; only the most efficient ones are actually fired. Furthermore, adapted control strategies, using dynamic data selection, allow the process to focus on the required parts of the image. This paper describes some examples of the rules and the set of performance measurements used for rule efficiency evaluation.
By analyzing the tracking scenario thoroughly, the gate tracking scenario may fall into four states: (S1) the common tracking state, the main state, in which the target moves smoothly and there is little disturbance from foreground and background; (S2) the target-unstable state in the gate, with violent target movement and the target immersed in strong disturbances; (S3) the target escaping from the tracking gate; and (S4) the target completely lost. Fuzzy mathematics is introduced to describe these four states and the transitions among them, forming a finite-state fuzzy automaton. Several statistical parameters have proved very effective for describing the four states by means of fuzzy concepts. The automaton provides both the fuzzy probabilities of the four tracking states and the transition matrix among them. The elements of the matrix, which represent the transition probabilities among the four states, are set up from a combination of the statistical parameters. The fuzzy automaton provides the most probable state decision to guide the multi-state gate tracker in choosing the proper tracking tactics at any time. Different tracking algorithms can be applied in different states in order to keep tracking under both common and uncommon tracking situations.
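A minimal sketch of such a fuzzy state automaton follows; the transition matrix here is an assumed illustration, not the one built from the paper's statistical parameters.

```python
import numpy as np

STATES = ["S1 common", "S2 unstable", "S3 escaping", "S4 lost"]

# Hypothetical transition matrix: row i holds the fuzzy probabilities of moving
# from state i to each state (in practice these come from statistical parameters).
T = np.array([[0.80, 0.15, 0.04, 0.01],
              [0.30, 0.50, 0.15, 0.05],
              [0.10, 0.20, 0.50, 0.20],
              [0.05, 0.05, 0.10, 0.80]])

def step(mu, T):
    """Propagate the fuzzy state memberships one frame and renormalize."""
    mu = mu @ T
    return mu / mu.sum()

mu = np.array([1.0, 0.0, 0.0, 0.0])          # start confidently in common tracking
for _ in range(3):                           # three frames of observation
    mu = step(mu, T)
decision = STATES[int(np.argmax(mu))]        # most probable state guides the tracker
print(decision)                              # → S1 common
```

The tracker would then dispatch to the tracking algorithm associated with the decided state, re-estimating the memberships every frame from the measured statistical parameters.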
Region analysis of binary images includes standard "blob counting" as well as the computation of geometric
properties (area, perimeter, moments, number of holes) of the connected components or regions
in an image. Software algorithms for region analysis are well known, and special purpose vendor
systems are now available for real-time and near real-time binary region analysis. In this paper, we
describe algorithms using a combination of software and inexpensive image hardware (e.g. boards from
Matrox, Imaging Technology) that perform region analysis on 512x480 images in times ranging from
about a half second (< 10 regions) to a few seconds (several hundred regions) on a PC/AT (80286)
based system, and well under a second for all the cases we have tested on an 80386 based system.
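A software-only sketch of the blob-counting core follows (a flood-fill labeling with area computation; the paper's hardware-assisted algorithms and other geometric properties are not reproduced).

```python
from collections import deque

def label_regions(img):
    """4-connected blob counting on a binary image; returns labels and areas."""
    h, w = len(img), len(img[0])
    labels = [[0] * w for _ in range(h)]
    areas, current = {}, 0
    for y in range(h):
        for x in range(w):
            if img[y][x] and not labels[y][x]:
                current += 1                           # new connected component
                areas[current] = 0
                q = deque([(y, x)])
                labels[y][x] = current
                while q:                               # flood fill the region
                    cy, cx = q.popleft()
                    areas[current] += 1
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = current
                            q.append((ny, nx))
    return labels, areas

img = [[1, 1, 0, 0],
       [0, 1, 0, 1],
       [0, 0, 0, 1]]
labels, areas = label_regions(img)
print(len(areas), areas)                               # → 2 {1: 3, 2: 2}
```

Perimeter, moments, and hole counts can be accumulated in the same pass; run-length encoding of the rows is the usual way to let the frame-grabber hardware share the work.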
This paper proposes a feedback process approach to shape transformation in OCR character recognition for
achieving higher recognition rates. An input image is deformed by warping function so that the stroke in the input
image and the corresponding stroke in the reference image will overlap each other as much as possible. Pattern
matching between both images is accomplished at several blurring levels, and a gradient method is applied to obtain
the optimal warping parameters for minimizing the distance between the images at each blurring level. Third-order
polynomials obtained by expanding affine transformations are used as warping functions.
Two kinds of pattern matching experiments were carried out. First, the image generated from an original image
by warping function with certain parameters was matched with the original image by the proposed method. Second,
machine-printed character images of different fonts were used for pattern matching. Both experimental results
show that the optimal warping parameters for global minimization of the distances were successfully determined by
the combination of a coarse-to-fine pattern matching and a gradient method. The proposed method is applicable
to a variety of shape compensations, based on other matching criteria.
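The coarse-to-fine gradient search can be illustrated in one dimension; here a pure translation stands in for the third-order polynomial warp, and a repeated box blur for the blurring pyramid (all signals and step sizes are illustrative, not the paper's).

```python
import numpy as np

def blur(sig, level):
    """Box blur repeated `level` times (a stand-in for the blurring pyramid)."""
    for _ in range(level):
        sig = np.convolve(sig, [0.25, 0.5, 0.25], mode="same")
    return sig

def distance(t, ref, inp, x):
    """Squared distance between the reference and the input warped by shift t."""
    return float(np.sum((np.interp(x + t, x, inp) - ref) ** 2))

x = np.arange(64, dtype=float)
ref = np.exp(-0.5 * ((x - 30.0) / 3.0) ** 2)      # reference stroke profile
inp = np.exp(-0.5 * ((x - 34.0) / 3.0) ** 2)      # input stroke, shifted by 4

t, eps, lr = 0.0, 0.5, 0.5
for level in (6, 3, 0):                           # coarse-to-fine blurring levels
    r, i = blur(ref, level), blur(inp, level)
    for _ in range(50):                           # gradient descent on the shift
        g = (distance(t + eps, r, i, x) - distance(t - eps, r, i, x)) / (2 * eps)
        t -= lr * g
print(round(t, 1))                                # recovers the shift: 4.0
```

Blurring widens the basin of attraction so the coarse level finds the global alignment, which the finer levels then refine; the full method does the same over the polynomial warp parameters in two dimensions.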
Character recognition methods can be categorized into two major approaches. One is pattern matching, which is
little affected by topological changes such as breaks in strokes. The other is structural analysis, which tolerates distorted
characters only if the topological features of their undistorted versions are kept.
We developed a new recognition method for hand-written numerals by combining the merits of the two
approaches. The recognition process consists of three steps: (1) an input character is recognized by a pattern-matching
method, which reduces the number of possible categories to 1.5 on the average, (2) the character is verified
to be true, false, or uncertain by a structural analysis method that we have newly developed, and (3) special
heuristic verification logics are applied to uncertain characters.
In the second step, the new structural analysis method uses the positions and directions of terminal points
extracted from thinned character images as a main feature. The extracted terminal points are labeled according to a
structural-feature distribution map prepared for each category. The generated labels are matched with template
label sets constructed by statistical analysis. The characteristics of the method are as follows: (1) it copes with distortion
of hand-written characters by using distribution maps for the positions and directions of feature points, and
(2) distribution maps can be automatically generated from statistical data in learning samples and easily tuned interactively.
The merits of combining the two methods are as follows: (1) the advantages of both pattern matching and structural
analysis are obtained, (2) the probabilities of steps 2 and 3 needing to be executed are 22% and 9% respectively,
which hardly affect the total processing time, and (3) as a result of steps 1 and 2, only a small number of
special logics are required.
In a test using unconstrained hand-written characters of low quality, the recognition rate and substitution rate
were 95.2% and 0.42% respectively. A recognition speed of 80 characters/second was achieved on a small hardware system.
Computer systems that produce color images usually consist of combinations of hardware and software components that perform
different functions, such as capturing, synthesizing or editing images, incorporating images into documents, proofing, and
rendering results. Images, and the documents containing them, must be stored temporarily or archivally, and transmitted from
component to component. Users will be able to operate such systems more conveniently and flexibly if the parts communicate
with each other using a standard interchange format, allowing any conforming module to communicate with any other.
This paper discusses the requirements for such an interchange color space, comparing them to some of the criteria used to measure
traditional color spaces. Any computer interchange space should employ the principles of colorimetry to provide device-independence,
which is necessary if negotiation between components is to be avoided. Other requirements are accuracy and
computational efficiency of transforms to and from device-dependent and internationally standard spaces, ability to represent all
visible colors, maximizing the amount of space occupied by the most likely image color gamuts, and robustness against
quantization and roundoff errors. This paper proposes ways to measure the performance of color spaces against the defined
requirements and applies the tests to several well-known color spaces.
This paper presents a method to automatically obtain a contour representation of character
patterns and rapidly generate high-quality characters of various sizes. Contour representation
generates contour vectors and attribute data. The contour vectors are generated
without losing information about the characters' shape. The attribute data is used to correct
non-uniform widths of vertical and horizontal lines, which is often a problem when characters
are transformed to lower resolution. The character generation step reproduces the characters
using data obtained in the contour representation step. Since this method represents characters
by contours, it requires comparatively little data. And since non-uniform line widths are
corrected rapidly, the characters are of high quality.
In this paper, a technique to convert page description language (PDL) code to a bitmap color image using an
optical image memory is discussed. The technique is useful in developing a low cost PDL compatible high resolution
graphic printing device for desktop publishing applications. The optical image memory in this discussion is a raster
scan addressable device which is write-only in the sense of electrical digital access, and its content is continually
accessible by analog optical means. Utilizing the low cost optical image memory, the algorithm can rasterize
PDL code without large amounts of expensive digital image frame memory. Since the optical image memory is a
raster scan addressable device, the rasterization of each pixel of the color image must be done within the access time
of each pixel writing. Some novel techniques are introduced in this algorithm. System development, performance
analysis, and the application of the algorithm will be presented.
It has become increasingly important to get high quality prints from TV or other video signals in the desktop
publishing and multi-media environment. Normally, a video signal has very low spatial resolution compared with
the spatial resolution of printing devices. Hard copy from direct printing of a video signal is usually very poor in
quality. To improve the video signal spatial resolution for printing, the algorithm presented in this paper applies
an adaptive spatial interpolation technique to the video signal. The algorithm is based on a computer graphics
curve-fitting algorithm with the addition of edge-restricted interpolation to sharpen edges and to enhance image
details. Therefore, the new interpolation algorithm will not only improve the spatial resolution but also improve
the sharpness of the printed image. In this paper, the aspects of theoretical development of the new algorithm
will be discussed and some simulation results will be shown and evaluated to present the effect of the algorithm.
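A one-dimensional sketch of edge-restricted interpolation follows; the threshold and the decision rule (copy the left sample across a strong edge) are illustrative choices, not the paper's algorithm.

```python
import numpy as np

def upsample_edge_restricted(row, edge_thresh):
    """Double a scanline's resolution: interpolate smooth regions linearly,
    but across strong edges copy the left sample so the edge stays sharp."""
    out = np.empty(2 * len(row) - 1)
    out[0::2] = row                                   # original samples pass through
    diff = np.abs(np.diff(row))
    mid = (row[:-1] + row[1:]) / 2.0                  # plain linear interpolation
    out[1::2] = np.where(diff > edge_thresh, row[:-1], mid)
    return out

row = np.array([10.0, 10.0, 10.0, 200.0, 200.0])
up = upsample_edge_restricted(row, edge_thresh=50)
# Linear interpolation alone would place a blurred 105 at the edge midpoint;
# the edge-restricted rule keeps the transition one sample wide instead.
print(up)
```

The 2-D algorithm applies the same idea along the interpolation curves in both directions, so smooth areas gain resolution while edges are not smeared by the upscaling.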
A regional information guidance system has been developed on an image workstation. Two main features of this
system are hypermedia data structure and friendly visual interface realized by the full-color frame memory system.
As the hypermedia data structure manages regional information such as maps, pictures, and explanations of points
of interest, users can retrieve this information item by item, following their changing interests. For
example, users can retrieve the explanation of a picture through the link between pictures and text explanations. Users
can also traverse from one document to another by using keywords as cross-reference indices. The second feature is
the use of a full-color, high-resolution, wide-space frame memory for visual interface design. This frame memory
system enables real-time operation on image data and natural scene representation. The system also provides a half-tone
representation function which enables fade-in/out presentations. These fade-in/out functions, used in displaying
and erasing menus and image data, make the visual interface gentle on the eyes.
The system we have developed is a typical example of multimedia applications. We expect the image workstation
will play an important role as a platform for multimedia applications.
A technique is presented for intraframe color image data compression which produces visually lossless imagery
compared to the original. This algorithm consists of a color vector quantizer operating in the Luv uniform color
space, followed by a reversible codeword assignment strategy that uses prediction to achieve conditional entropy type
bit rates. Unlike differential pulse code modulation (DPCM), predicted values are used for codebook selection instead
of the computation and coding of a residual signal.
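A toy sketch of prediction-driven codeword assignment follows; the codebook, the plain Euclidean distance, and the pixel values are illustrative stand-ins (the actual Luv-space quantizer and the entropy coder are not reproduced).

```python
import numpy as np

def quantize(vec, codebook):
    """Return the index of the nearest codeword (plain Euclidean distance here)."""
    return int(np.argmin(np.sum((codebook - vec) ** 2, axis=1)))

def encode(pixels, codebook):
    """Reversible predictive assignment: each pixel's codeword is transmitted as
    its rank among codewords ordered by distance from the *predicted* value
    (the previous reconstructed pixel), so likely symbols get small codes."""
    symbols, prev = [], codebook[0]
    for p in pixels:
        order = np.argsort(np.sum((codebook - prev) ** 2, axis=1))
        idx = quantize(p, codebook)
        symbols.append(int(np.where(order == idx)[0][0]))  # rank of chosen word
        prev = codebook[idx]                               # decoder mirrors this
    return symbols

# Toy codebook of Luv-like color vectors (values are illustrative only).
codebook = np.array([[20., 0., 0.], [21., 1., 0.], [80., 5., 5.], [82., 4., 6.]])
pixels = np.array([[20., 0., 0.], [21., 1., 0.], [21., 1., 0.], [80., 5., 5.]])
print(encode(pixels, codebook))                            # → [0, 1, 0, 2]
```

Because the decoder can rebuild the same ranking from its own reconstruction, the mapping is reversible, and the skewed symbol distribution is what lets the subsequent entropy coder approach conditional-entropy bit rates without coding a residual signal as DPCM does.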
We built a multimedia document system for the SIGGRAPH Interactive Proceedings to demonstrate the potential and
challenges in using technology to better capture the essence of SIGGRAPH conferences. The prototype system uses the
NeXT computer system to present textual, mathematical, illustrative, colorful, audio, video and animated material. Special
attention was given to including tools for interactive manipulation of images included in typical SIGGRAPH papers.