Paragon-IPS is a comprehensive software system available on virtually all generations of image processing hardware. It is designed for image processing departments and for scientists and engineers doing image processing full-time. It is used by leading R&D labs in government agencies and Fortune 500 companies. Applications include reconnaissance, non-destructive testing, remote sensing, and medical imaging.
Advances in workstation technology have researchers spending less time building and computing their data sets, and more time experimenting with them. Conventional computer workstations allow researchers to interactively modify images and other data, while the results of these changes are graphically rendered. Software developers are finding it necessary to redesign their applications to be more modular and interactive, taking advantage of the changes in workstation performance and architecture. In an effort to provide some of the tools necessary to assist this software effort, the University of Lowell's Center for Productivity Enhancement has developed an object oriented image processing environment. The Imaging Kernel System (IKS) is a comprehensive environment for image processing and analysis. It is viewed here as one module in an object oriented visualization environment designed to utilize the capabilities of today's visual workstation.
This paper describes Method Registration, a technique for creating a highly extensible application environment. Although applicable to a variety of application domains, we are using Method Registration as the basis for a flexible imaging environment. Method Registration uses a simple declarative language to give the application knowledge of how to incorporate new functionality into the environment. The imaging environment we are developing will offer a visual-programming-based user interface, giving the end user a flexible mechanism for easily defining complex image processing operations. Method Registration will allow developers to add new image processing functionality without burdening them with interfacing to other layers of the application.
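The abstract does not show the registration language itself, so the following is an illustrative sketch only: a minimal registry in which a method's name and parameter signature are declared once, letting the environment dispatch and type-check calls without built-in knowledge of the method. All names here (`register_method`, `invoke`, `threshold`) are hypothetical, not the paper's actual interface.

```python
# Illustrative sketch of a method-registration mechanism; the declarative
# entries and every name below are hypothetical, not the paper's language.

REGISTRY = {}

def register_method(name, params):
    """Declaratively register an image-processing method under `name`,
    recording its parameter signature so the environment can present it
    to the user and validate calls without knowing the method itself."""
    def wrap(func):
        REGISTRY[name] = {"func": func, "params": params}
        return func
    return wrap

@register_method("threshold", params={"level": int})
def threshold(image, level):
    # Toy operation on a 2-D list of pixel values.
    return [[255 if p >= level else 0 for p in row] for row in image]

def invoke(name, image, **kwargs):
    """Dispatch a registered method by name, as the environment would."""
    entry = REGISTRY[name]
    for pname, ptype in entry["params"].items():
        if not isinstance(kwargs[pname], ptype):
            raise TypeError(f"{pname} must be {ptype.__name__}")
    return entry["func"](image, **kwargs)

result = invoke("threshold", [[10, 200], [128, 5]], level=128)
# result == [[0, 255], [255, 0]]
```

The point of the declarative layer is that a visual-programming front end can enumerate `REGISTRY` to build its palette of operations, so adding a method requires no changes to the interface code.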
Because of the lack of powerful software tools, developers of computer-based imaging applications must be capable of both programming computer systems and conducting imaging research. This paper discusses some of the human interface issues encountered during the implementation of a commercial product that provides integrated experimentation and programming capabilities through separate interfaces. One interface, designed to facilitate the specification and execution of processes, is described. The use of hierarchies, icons, dynamics, and color helps to manage complexity by drawing attention to the important information.
One of the most exciting current developments in the area of computer graphics is the emergence of the microcomputer-based engineering workstation. A recent market study suggests that over the next decade, sales will grow at about 1.5 times the rate of growth for the rest of the computer graphics industry...reaching sales of over $4.5 billion by 1990. This paper describes these workstations and discusses present capabilities and future trends.
Image processing applications have traditionally been the domain of special-purpose pixel-processing machines. But with the introduction of graphics supercomputers last year, an entirely new class of imaging platforms has emerged. This class integrates into a single hardware platform the imaging functions that previously required multiple platforms, and provides a level of computational capability equal to, and in some cases greater than, that offered by specialized pixel-processing machines. At the same time, this new class offers a wider range of related computational and visualization features, such as geometry processing, real-time floating-point computation, compositing, high-level languages, and support for multiple users, among others.
Usability is an important component of workstation quality. Just as reliability, performance, and functionality can be evaluated and measured, so should usability. This paper describes a general approach to evaluate usability. Several specific methods of usability testing and measurement are then presented in detail.
As users increasingly employ graphics workstations for visualization applications, old benchmarks and measures -- which focus narrowly on throughput rates -- are becoming increasingly inadequate. New evaluation methods are needed that take into account the complexities of visualization technology, including interactive manipulation, perceptual color models, stereoscopic viewing, and intuitive graphics input devices. Also, the most effective evaluations will be those that account for the hardware/software/user system, rather than focusing on hardware-only considerations.
Images have begun to play an important part in office applications such as electronic publishing, automatic data entry (optical character recognition), and image communications (facsimile). While not every workstation has a scanner, scanners are coming to be regarded as standard peripherals satisfying a variety of user needs.
High resolution displays are one of the key elements that distinguish user-oriented document finishing or publishing stations. A number of factors have been involved in bringing these to the desktop environment. At Sigma Designs we have concentrated on enhancing the capabilities of IBM PCs and compatibles and Apple Macintosh computer systems.
A parallel computer for image processing has been built by the Eastman Kodak Company. This computer is composed of an arbitrary number of transputers connected in a toroidal array with a controller node and is accessible over the VME bus from a host computer. Many benchmarks have demonstrated the superior performance such a parallel machine can give. A special, high-level language for image processing on the parallel processor has been developed that provides simple but powerful constructs for operating on images.
The world of computer-mediated picturing is often divided into computer graphics and image processing, or graphics and imaging, or image synthesis and image analysis. A slightly different division is proposed here that captures the principal distinction of the earlier terminology but cleanly avoids certain confusions, such as how to classify an electronic painting/retouching application. The distinction is drawn on the representation of the data to be pictured - whether it is in terms of geometric elements or sample arrays. The two worlds are quite complex so convenient coordinate systems, called sophistication meters, are proposed for both domains.
The Stellar Graphics Supercomputer Model GS1000™ is the definitive member of a new class of machines incorporating the architecture and performance of near-supercomputers with a level of graphics performance well beyond the range of traditional terminals and workstations. Designed to eliminate the architectural bottlenecks which limit overall throughput and performance, the GS1000 includes a number of novel features which significantly narrow the gap between "peak" and achievable performance. This paper includes a brief summary of the overall GS1000 architecture, but focuses in some detail on the hardware and software components related to graphical computation. Examples of the applicability of Stellar's Virtual Pixel Maps™ facility to both geometric graphics and image processing are provided, as are illustrations of how these two forms of image computing can be applied to a single application.
Workstations have a large number of advantages for use as a personal computing resource. Unfortunately, currently these machines do not have enough performance to provide interactive 2-D and 3-D imaging capability, and aren't likely to in the foreseeable future. Consequently, they must be accelerated in some fashion. Accelerators need to be physically, visually, and computationally integrated with the workstation to be of maximum effectiveness. Furthermore, the rapidly changing requirements and increasing functionality of today's applications demand a high level of flexibility, impossible to meet with a traditional hardwired image processor architecture. This paper will describe the development of one form of the new breed of imaging accelerator and experiences (and lessons learned) from its application to a variety of problems.
Requirements for fast access to larger data sets and the desire to provide better interactions between high performance computers and visualization subsystems are calling for a new generation of workstation networking facilities. This new generation of networking will be 10 to 50 times faster than the existing one based on the presently established Ethernet and TCP/IP industry standard components. It will take advantage of both increased raw hardware bandwidth (100-200 Mbit/s) and also improved efficiencies (50-80%). In the near term and for selected applications, these facilities will replace or supplement Ethernet and TCP/IP. The FDDI physical layer protocol based on fiber optic technology and logical layer protocols based on simpler and more efficient ("light-weight") software models offer prospects for these near term, mid-range-performance applications. However, there are needs as well for even higher networking performance at 1 Gbit/s. These needs will also drive special solutions for workstations in the near term - and perhaps even standards where workstations are used in combination with supercomputer processing. In all cases the implications for system architectures are significant, and numerous changes are required in today's workstation products to support these new levels of network performance.
The medical, military, scientific and industrial communities have come to rely on imaging and computer graphics for solutions to many types of problems. Systems based on imaging technology are used to acquire and process images, and to analyze and extract data from images that would otherwise be of little use. Images can be transformed and enhanced to reveal detail and meaning that would go undetected without imaging techniques. The success of imaging has increased the demand for faster and less expensive imaging systems, and as these systems become available, more and more applications are discovered and more demands are made. The challenge of meeting these demands forces the designer to attack the problem of imaging from a different perspective. The computing demands of imaging algorithms must be balanced against the desire for affordability and flexibility. Systems must be flexible and easy to use, ready for current applications but at the same time anticipating new, unthought-of uses. Here at the University of Washington Image Processing Systems Lab (IPSL) we are focusing our attention on imaging and graphics systems that implement imaging algorithms for use in an interactive environment. We have developed a PC-based imaging workstation with the goal of providing powerful, flexible, floating-point processing capabilities, along with graphics functions, in an affordable package suitable for diverse environments and many applications.
Modern workstations need to address aspects of visual computing. These image workstations are characterized by programmability, flexibility, and interactivity. As a problem moves from a general processing problem into the processing of graphic elements and picture elements, the processing and data storage requirements explode. To meet this demand, specialized -- yet programmable -- devices have been designed and integrated into today's graphics workstation.
The Small Computer System Interface (SCSI) is used to interface an intelligent scanner to host computer systems. The SCSI-2 scanner command set provides an efficient method to control complex scanner operations and manage the large amounts of data produced by the device.
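As an illustrative sketch (not the paper's implementation), a host controls such a scanner by issuing command descriptor blocks (CDBs). The opcodes below (SCAN = 0x1B, SET WINDOW = 0x24, READ = 0x28) follow my recollection of the SCSI-2 scanner command set, and the field layouts are simplified; both should be verified against the standard.

```python
import struct

# Illustrative sketch of SCSI-2 scanner CDB construction.
# Opcodes as recalled from the SCSI-2 scanner command set; verify
# against the standard before relying on them.
SCAN = 0x1B        # start a scan
SET_WINDOW = 0x24  # define the scan window (area, resolution, depth)
READ = 0x28        # transfer image data to the host

def scan_cdb(window_id_list_len=0):
    """Build a 6-byte SCAN CDB: opcode, reserved x3,
    transfer length (window-id list length), control byte."""
    return struct.pack(">BBBBBB", SCAN, 0, 0, 0, window_id_list_len, 0)

def read_cdb(nbytes):
    """Build a 10-byte READ CDB with a 3-byte transfer length in
    bytes 6-8 (simplified: data-type fields left zero)."""
    hi, mid, lo = (nbytes >> 16) & 0xFF, (nbytes >> 8) & 0xFF, nbytes & 0xFF
    return struct.pack(">BBBBBBBBBB", READ, 0, 0, 0, 0, 0, hi, mid, lo, 0)
```

A host driver would send a SET WINDOW to configure resolution and area, issue SCAN, then loop on READ to drain the scanner's buffer, which is how the command set "manages the large amounts of data produced by the device."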
Document processing systems based on electronic imaging technology are evolving rapidly, motivated by technology advances in optical storage, image scanners, image compression, high speed digital communications, and high resolution displays. These systems require high speed, reliable image scanning to create the digital image data base at the heart of the applications they address. High speed production document scanners must provide the capability of converting a wide variety of input material into high quality digital imagery. The required capabilities include: (i) the ability to scan varying sizes and weights of paper, (ii) image enhancement techniques adequate to produce quality imagery from document material that may depart significantly from standard high contrast black and white office correspondence, (iii) standard compression options, and (iv) a standard interface to a host or control processor providing full control of all scanner operations and all image processing options. As electronic document processing systems proliferate, additional capabilities will be required to support automated or semi-automated document indexing and selective capture of document content. Capabilities now present on microfilming systems will be required as options or features on document capture systems, including endorsers, bar code readers, and optical character recognition (OCR). Bar code and OCR capabilities will be required to support automated indexing of scanned material, and OCR within specific areas of scanned document material will be required to support indexing and specific application needs. These features will also be supported and controlled through a standard host interface. This paper describes the architecture of the TDC DocuScan Digital Image Scanner.
The scanner is a double-sided scanner that produces compressed imagery of both sides of a scanned page in under two seconds. Full control of all scanner operation and processing options is provided through a SCSI interface. The scanner architecture has been designed to accommodate OCR and bar code recognition capability and other features now in common use in microfilming equipment, and the technical approach to these extensions will be described.
A novel device for digitizing 35 mm transparencies has been designed by Barneyscan Corporation to provide medium-high resolution image input to microcomputers. The basic concept of the scanner is that a 35 mm slide moves on a linear track in front of a lens and a linear sensor while illuminated with a slit light source. Three passes are made for color digitizing; red, green and blue separation filters are interposed for each respective pass. A single-pass greyscale mode is also included. The 1024-element photo array measures each of 1520 lines in the direction of slide travel, yielding about 1.5 million pixels. Each pixel is digitized to 256 levels (8 bits) of intensity per primary color, for approximately 16.8 million shades. The resulting file size is 4.5 megabytes. Extensive software for pre-calibration and for automatic control of scan settings is included. Image processing software for cropping, resizing, color correction, sharpening and file format conversion is also included. Resultant images can be displayed, modified and edited, saved to disk or to hard copy, or distributed over a network.
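The abstract's figures are mutually consistent, as the short check below shows (assuming, as the abstract implies, one byte per primary per pixel, i.e. 3 bytes per color pixel):

```python
# Consistency check of the scanner's stated numbers
# (assumes 3 bytes per pixel for a color scan).
pixels = 1024 * 1520        # sensor elements x scan lines
colors = 256 ** 3           # 8 bits per primary, three primaries
file_bytes = pixels * 3     # one byte per primary per pixel
file_mb = file_bytes / 2**20

# pixels == 1_556_480 (about 1.5 million)
# colors == 16_777_216 (about 16.8 million)
# file_mb ~= 4.45     (about 4.5 megabytes)
```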
For a long time, automatic reading has been limited to optical character recognition. A year ago, except for one high-end product, all industrial software or hardware products were limited to reading mono-column texts without images. This does not correspond to real-life needs. In a typical company, the pages that need to be transformed into electronic form are not only typewritten pages, but also complex pages from professional magazines, technical manuals, financial reports and tables, administrative documents, various directories, lists of spare parts, etc. The real problem of automatic reading is to transform such complex paper pages, including columns, images, drawings, titles, footnotes, legends, and tables, occasionally in landscape format, into a computer text file without the help of an operator. Moreover, the problem is to perform this operation at an economical cost with limited computer resources in terms of processor and memory.
In this paper, a multiple sensor integration technique with neural network learning algorithms is presented which can enhance the reading accuracy of the hand-written numerals. Many document reading applications involve hand-written numerals in a predetermined location on a form, and in many cases, critical data is redundantly described. The amount of a personal check is one such case which is written redundantly in numerals and in alphabetical form. Information from two optical character recognition modules, one specialized for digits and one for words, is combined to yield an enhanced recognition of the amount. The combination can be accomplished by a decision tree with "if-then" rules, but by simply fusing two or more sets of sensor data in a single expanded neural net, the same functionality can be expected with a much reduced system cost. Experimental results of fusing two neural nets to enhance overall recognition performance using a controlled data set are presented.
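The fusion idea can be sketched as follows (a toy illustration, not the paper's network): the two recognizers' per-class score vectors are concatenated and fed through a single layer whose weights would, in practice, be learned. All weights and scores below are made-up values.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def fused_decision(digit_scores, word_scores, weights):
    """Toy sensor-fusion layer: concatenate the two 10-class score
    vectors (courtesy-amount digit recognizer + legal-amount word
    recognizer) and apply one linear layer followed by softmax.
    `weights` is a 10 x 20 matrix, one row per output class."""
    x = digit_scores + word_scores                       # 20 inputs
    logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return softmax(logits)

# Toy weights: each output class simply sums its two sensors' scores,
# so agreement between the recognizers reinforces that class.
W = [[3.0 if (j % 10) == i else 0.0 for j in range(20)] for i in range(10)]

digits = [0.1, 0.6, 0.3] + [0.0] * 7   # digit recognizer favors "1"
words  = [0.2, 0.7, 0.1] + [0.0] * 7   # word recognizer also favors "1"
probs = fused_decision(digits, words, W)
best = max(range(10), key=lambda i: probs[i])   # best == 1
```

Compared with hand-written "if-then" fusion rules, training the expanded layer end-to-end lets the combination weights be learned from data, which is the cost advantage the abstract claims.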
Office automation by electronic text processing has not reduced the amount of paper used for communication and storage; the present boom in FAX systems proves this tendency. With this growing degree of office automation, the paper-computer interface becomes increasingly important. To be useful, this interface must be able to handle documents containing text as well as graphics, and convert them into an electronic representation that captures not only content (as in current OCR readers), but also the layout and logical structure. We describe a system for the analysis of business letters which is able to extract the key elements of a letter, such as its sender, the date, etc. The letter can thus, for instance, be stored in electronic archival systems, edited by structure editors, or forwarded via electronic mail services. This system was implemented on a Symbolics Lisp machine for the high-level part of the analysis and on a VAX for the low- and medium-level processing stages. Some practical results are presented and discussed. Apart from this application, our system is a useful testbed for implementing and testing sophisticated control structures and model representations for image understanding.
This paper describes the optical hand-held scanner designed for use as a data entry device, as well as for the Kurzweil reading machine for the blind. This device has numerous advantages over other scanners introduced into the market, chief among them its high resolution, one-inch field of view, bidirectional capability, compact block shape, and high scanning speed. The device allows the user to scan single sheets as well as bound documents and magazines. During its three-year development, the major emphasis has been on refining the design to make the device easy to use and simple to assemble. This paper addresses the technical issues associated with the human-factors considerations that led us to this design.
Complete Document Recognition is the logical objective of the Optical Character Recognition (OCR) industry. Complete Document Recognition (CDR) is the conversion of paper documents or electronic document images into the optimum computer-usable format for the user's application. The many components of CDR are described and defined. Recently developed systems are beginning to approach CDR, and the future developments needed to achieve it are described.
The most important requirements in the development of an OCR system are font independence and the ability to read free-layout text. The feature extraction algorithm, based on contour tracing, generates size-invariant geometrical and topological features which make recognition as font-independent as possible. In our OCR system (Recognita) these features are arranged in a tree structure which enables fast classification. The character- and line-finding algorithm is designed to meet the second requirement, including the recognition of proportional spacing, ligatures, and kerning, and the automatic separation of graphics and text.
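A hedged illustration of the tree idea (Recognita's actual features and tree are not given in the abstract): size-invariant topological features such as hole count or free-endpoint count can partition the character classes level by level, so classifying a character costs one feature test per tree level rather than a comparison against every class. Every feature, split, and candidate set below is hypothetical.

```python
# Hypothetical feature-tree classifier; features and splits are
# illustrative only, not Recognita's actual feature set.
# Each internal node tests one size-invariant feature; each leaf is the
# candidate-character set narrowed down by the path taken.

TREE = {
    "feature": "holes",            # number of enclosed regions
    "branches": {
        0: {"feature": "endpoints",  # free line endings in the contour
            "branches": {2: "CILS", 3: "EFTY", 4: "KX"}},
        1: {"feature": "endpoints",
            "branches": {0: "DO0", 1: "PQ69", 2: "AR4"}},
        2: "B8",
    },
}

def classify(node, features):
    """Walk the tree, doing one feature lookup per level; returns the
    leaf's candidate-character string for finer discrimination."""
    while isinstance(node, dict):
        node = node["branches"][features[node["feature"]]]
    return node

# An 'A'-like shape: one enclosed hole, two free line endpoints.
candidates = classify(TREE, {"holes": 1, "endpoints": 2})  # "AR4"
```

Because the tested features are size-invariant, the same tree applies across point sizes, which is what makes this style of classification font-tolerant.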
The image scanner with a CCD sensor has become popular in recent years. A scanner basically comprises several major modules, namely the mechanical structure/document handling system, the illumination system, the electronic controller, the sensor module, and the optics. As image scanners have developed over the past few years, we have seen many remarkable optical system designs. I will review them here.
The first steps have been taken towards fully electronic photo reproduction on the desktop: a. Desktop scanning for photography continues to ride favorable price-performance curves. b. Desktop photo editing software has been introduced and shows promise. c. Users are beginning to become educated about the promise of doing photographic reproduction on the desktop. The challenge has become clearer for applications developers. We must improve reproduction quality and throughput to provide cost justification for doing photos on the desktop. We must also promote user confidence in desktop photo reproduction so that high-end photo reproductions can be attempted. Additional progress needs to be made in several areas: a. Focus on quality capture, tone manipulation, and output as the key issues. b. Special effects and gray paint must take a temporary back seat. c. Picture handling software products must ensure the best photographic reproduction possible.
No milestone has proven as elusive as the always-approaching "year of the LAN," but the "year of the scanner" might claim the silver medal. Desktop scanners have been around almost as long as personal computers, and everyone thinks they are used for obvious desktop-publishing and business tasks: scanning business documents, magazine articles and other pages, and translating those words into files your computer understands. But, until now, the reality fell far short of the promise. It is true that scanners deliver an accurate image of the page to your computer, but the software to recognize this text has been woefully disappointing. Old optical-character recognition (OCR) software recognized such a limited range of pages as to be virtually useless to real users. (For example, one OCR vendor specified 12-point Courier font from an IBM Selectric typewriter; the same font in 10-point, or from a Diablo printer, was unrecognizable!) Computer dealers have told me the chasm between OCR expectations and reality is so broad and deep that nine out of ten prospects leave their stores in disgust when they learn the limitations. And this is a very important, very unfortunate gap, because the promise of recognition -- what people want it to do -- carries with it tremendous improvements in our productivity and in our ability to get tons of written documents into our computers, where we can do real work with them. The good news is that a revolutionary new development effort has led to the new technology of "page recognition," which actually does deliver the promise we've always wanted from OCR. I'm sure every reader appreciates the breakthrough represented by the laser printer and page-makeup software, a combination so powerful it created new reasons for buying a computer.
A similar breakthrough is happening right now in page recognition: the Macintosh (and, I must admit, other personal computers) equipped with a moderately priced scanner and OmniPage software (from Caere Corporation) can recognize not only different fonts (omnifont recognition) but different page (omnipage) formats as well.