A new smart optical sensor, the MAPP2200, is presented. It is based on four years of practical experience with an earlier line-sensor device, the LAPP1100. Using a 256*256 array of photodiodes, the new device is capable of capturing a full image. The chip includes circuitry for A/D conversion and digital image processing. The processor, a line-parallel SIMD machine, handles line data at a rate of 4 MHz. The device performs common early vision tasks such as filtering, edge detection, histogramming, and correlation at rates of 10-100 frames per second. Simpler tasks such as binary template matching can be performed at more than 1000 frames per second. The paper covers both hardware and software aspects of the new device.
From its inception, the iWarp microprocessor was designed for parallel computing. The iWarp processor (cell) comprises a computation agent and a communication agent which operate independently. Both synchronous and asynchronous communication are supported. A processor may have up to 8 asynchronous data movement operations proceeding simultaneously through the spooling (DMA) mechanism. Independently, the computation agent may synchronously move up to 4 words (2 reads, 2 writes) between the cell and the interconnecting pathways in each instruction. The pathways are a set of 4 physical interconnects providing 40 MBytes/sec bidirectional communication each (320 MBytes/sec/cell) to adjacent cells. Local memory bandwidth for the computation agent is 160 MBytes/sec. Communication overhead costs are critical for efficient parallel computing. iWarp's communication is based on connections which establish the desired topology once--the connections remain in place until removed by the program. Multiple (logical) connections may share a single pathway. Communication over established connections incurs little or no overhead. We discuss the software tools that help build efficient programs both at the single-cell level (C, F77 compilers) and at the array level (Apply/Adapt, Assign, C*, etc.). Using the communication features of iWarp, we present measured performance on some frequently used data movement operations (scatter, gather, broadcast, transpose).
International Standards Organization (ISO) Moving Picture Experts Group (MPEG) technology enables computers, communication, and consumer electronics products to use high-quality digital video and is expected to play a critical role in the emerging multimedia movement. Real-time MPEG performance can be achieved in a single chip with commercial VLSI technology, but requires high-performance buses, memory bandwidths, and arithmetic capabilities. For example, the C-Cube MPEG chip consists of a general-purpose processor and dedicated image coprocessors, which implement the ISO MPEG standard for a range of quality and resolution levels. Software can be tailored to specific MPEG applications and environments. There is special hardware support for discrete cosine transforms, subpel resampling, variable-length codes, rounding, and clipping. The built-in DRAM controller interfaces directly (without glue chips) to standard DRAMs and utilizes fast page mode. The video bus uses International Radio Consultative Committee 601 format (CbYCrY 4:2:2). Interrupt channels handle the movement of data between DRAM and video bus, and between DRAM and bitstream bus. Multiple chips can be combined for higher-resolution applications. A fast algorithm for the DCT reduces the number of multiplications.
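The closing claim about a fast DCT algorithm can be made concrete. The abstract does not specify which fast algorithm the chip uses, so the sketch below illustrates only the first and simplest saving: the 2-D DCT is separable, so an 8*8 block needs 2N^3 = 1024 multiplications done as row/column 1-D transforms instead of N^4 = 4096 for a direct 2-D kernel; dedicated fast 1-D DCT factorizations reduce the count further.

```python
import math

N = 8

# 1-D DCT-II of a length-N sequence (naive O(N^2) form, unnormalized).
def dct_1d(x):
    return [sum(x[n] * math.cos(math.pi * (n + 0.5) * k / N) for n in range(N))
            for k in range(N)]

# Separable 2-D DCT: a 1-D DCT on each row, then on each column of the result.
def dct_2d(block):
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[i][j] for i in range(N)]) for j in range(N)]
    # transpose back so result[k][l] indexes vertical, then horizontal frequency
    return [[cols[j][i] for j in range(N)] for i in range(N)]

direct_mults = N ** 4          # N^2 coefficients, N^2 multiplications each
separable_mults = 2 * N ** 3   # 2N one-dimensional transforms of N^2 mults each
```

For a constant block, only the DC coefficient is nonzero, which is a quick sanity check of the transform.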
This paper will focus on the architectures of VLSI programmable processing components for image computing applications. TI, the maker of industry-leading RISC, DSP, and graphics components, has developed an architecture for a new generation of image processors capable of implementing a plurality of image, graphics, video, and audio computing functions. We will show that the use of a single-chip heterogeneous MIMD parallel architecture best suits this class of processors--those which will dominate the desktop multimedia, document imaging, computer graphics, and visualization systems of this decade.
The wavelet transform provides a new method for signal/image analysis where high-frequency components are studied with finer time resolution and low-frequency components with coarser time resolution. It decomposes a scanned signal into localized contributions for multiscale analysis. This paper presents a systolic architecture which can compute the discrete wavelet transform (DWT) in an efficient manner. When the number of data points windowed in the input is N = 2^m, our DWT systolic architecture is composed of m layers of identical 1-dimensional arrays, which compute the high-pass and the low-pass filtered components simultaneously. The input data stream can enter and be processed 'on the fly' continuously at the rate of one data point per clock period T. The computation time for a large number of successive DWT problems is NT per DWT.
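As an illustration of the per-layer computation, the following sketch performs an m-layer DWT on N = 2^m samples, with each level producing the low-pass and high-pass components together, as the abstract's identical 1-D arrays do. The Haar filter pair used here is an assumption for brevity; the abstract does not name a particular wavelet.

```python
# One decomposition level: low-pass (pairwise averages) and high-pass
# (pairwise differences) computed together, as in one layer of the array.
# The Haar filters are assumed purely for illustration.
def dwt_level(x):
    low = [(x[2 * i] + x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    high = [(x[2 * i] - x[2 * i + 1]) / 2 for i in range(len(x) // 2)]
    return low, high

# Full m-layer DWT of N = 2**m samples: cascade the low-pass branch,
# keeping the high-pass (detail) coefficients of every level.
def dwt(x):
    coeffs = []
    low = list(x)
    while len(low) > 1:
        low, high = dwt_level(low)
        coeffs.append(high)
    coeffs.append(low)  # final approximation coefficient
    return coeffs
```

For N = 8 the cascade runs m = 3 levels, matching the abstract's m layers.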
The Proteus architecture is a highly parallel MIMD (multiple-instruction, multiple-data) machine, optimized for large-granularity tasks such as machine vision and image processing. The system can achieve 20 Gigaflops (80 Gigaflops peak). It accepts data via multiple serial links at a rate of up to 640 megabytes/second. The system employs a hierarchical reconfigurable interconnection network with the highest level being a circuit-switched Enhanced Hypercube serial interconnection network for internal data transfers. The system is designed to use 256 to 1,024 RISC processors. The processors use one-megabyte external Read/Write Allocating Caches for reduced multiprocessor contention. The system detects, locates, and replaces faulty subsystems using redundant hardware to facilitate fault tolerance. The parallelism is directly controllable through an advanced software system for partitioning, scheduling, and development. System software includes a translator for the INSIGHT language, a parallel debugger, low- and high-level simulators, and a message-passing system for all control needs. Image processing application software includes a variety of point operators, neighborhood operators, convolution, and the mathematical morphology operations of binary and gray-scale dilation, erosion, opening, and closing.
The proposed architecture is a logical design specifically for image processing and other related computations. The design is a hybrid electro-optical concept consisting of three tightly coupled components: a spatial configuration processor (the optical analog portion), a weighting processor (digital), and an accumulation processor (digital). The systolic flow of data and image processing operations are directed by a control buffer and pipelined to each of the three processing components. The image processing operations are defined by an image algebra developed by the University of Florida. The algebra is capable of describing all common image-to-image transformations. The merit of this architectural design is how elegantly it handles the natural decomposition of algebraic functions into spatially distributed, point-wise operations. The effect of this particular decomposition allows convolution type operations to be computed strictly as a function of the number of elements in the template (mask, filter, etc.) instead of the number of picture elements in the image. Thus, a substantial increase in throughput is realized. The logical architecture may take any number of physical forms. While a hybrid electro-optical implementation is of primary interest, the benefits and design issues of an all digital implementation are also discussed. The potential utility of this architectural design lies in its ability to control all the arithmetic and logic operations of the image algebra's generalized matrix product. This is the most powerful fundamental formulation in the algebra, thus allowing a wide range of applications.
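The decomposition described above, in which convolution cost is driven by the number of template elements rather than by the number of image pixels, can be sketched as one shift-and-accumulate pass per template weight. In the hybrid design the optical portion would supply the spatial shifts; plain Python stands in here, and the representation is illustrative only.

```python
# Shift an image by (dy, dx), filling vacated positions with zero.
def shift(img, dy, dx):
    h, w = len(img), len(img[0])
    return [[img[y - dy][x - dx] if 0 <= y - dy < h and 0 <= x - dx < w else 0
             for x in range(w)] for y in range(h)]

# Convolution organized as one whole-image shift-and-accumulate pass per
# template element, so the outer loop count depends only on the template
# size, not on the number of picture elements.
def convolve(img, template):
    # template: dict mapping (dy, dx) offsets to weights
    h, w = len(img), len(img[0])
    acc = [[0] * w for _ in range(h)]
    for (dy, dx), weight in template.items():  # one pass per template element
        shifted = shift(img, dy, dx)
        for y in range(h):
            for x in range(w):
                acc[y][x] += weight * shifted[y][x]
    return acc
```

A 3-element row with a 2-element template takes exactly two accumulate passes, regardless of how wide the row is.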
The digital opto-electronic system presented here can be used as a general-purpose programmable multiprocessor computer. The system can be used in single-instruction multiple-data as well as multiple-instruction multiple-data modes. The system contains 64 simple processing elements (PEs) executing logic operations, and each PE can communicate with its four neighbors using a mesh interconnection network. To illustrate the multiprocessor performance, morphological image processing and distortion-invariant pattern recognition are presented.
Work is in progress towards an international standard in the area of Image Processing and Interchange. This paper gives a brief outline of the background to this standardization effort, indicates its intended areas of application, and discusses in detail the abstract models on which the application programmer's interface and image interchange facility are based. The data types and operator model that form the basis of this abstract imaging model are presented, and a number of general points are made.
The Programmer's Imaging Kernel System (PIKS) is an application program interface (API) for image processing. It is one of three parts of a standard for Image Processing and Interchange being developed by the International Standards Organization (ISO) and the International Electrotechnical Commission (IEC). This paper presents an overview of the API; companion papers discuss the imaging architecture and image interchange parts of the standard. PIKS contains a rich set of operators, tools, and utilities. PIKS operators are functional elements that perform manipulations of images or of data objects extracted from images in order to enhance, restore, or assist in the extraction of information from images. These operators range from primitive operators such as convolution and histogram generation to complex, higher level operators such as adaptive histogram equalization and texture feature extraction. PIKS tools are elements that create data objects to be used by PIKS operators, e.g., the generation of filter transfer functions. PIKS utilities are elements that perform basic mechanical implementation tasks such as extracting pixels from an image. PIKS provides a fundamental operator model that supports match point translation of images prior to processing, image-related region-of-interest processing control, image/operator coordinate index assignment, and the ability to define reusable chains of operators.
This paper gives a technical description of the Image Interchange Facility (IIF), which comprises both a format definition and a functional gateway specification. IIF is a part of the first International Image Processing and Interchange Standard (IPI), which is being developed by ISO/IEC JTC1/SC24. This paper reflects the related committee work performed up until January 1992. Considering the deficiencies and drawbacks of existing formats and current practices in exchanging digital images, the need for a new and more general approach to image interchange can be seen. This paper describes the requirements and design principles of the IIF data format and the IIF gateway. Furthermore, it explains the relation to the reference model for open communication (OSI) as well as the relation to the other parts of the IPI standard.
This paper presents an overview of the X Image Extension (XIE) proposal and its role in facilitating scalability of heterogeneous, image data management systems. The proposal is for a standard extension to the X11 Window System to provide applications with support for visually interactive image enhancement and display operations.
This is an overview of TIFF 5.0, which is formally specified in Tag Image File Format Specification, Revision 5.0 FINAL, An Aldus/Microsoft Technical Memorandum, 8/8/88, hereafter called 'the specification'. This note interprets the content of a TIFF file itself, not the content of the specification. One of the great things about the Aldus/Microsoft TIFF specification is that it contains much tutorial and historical information. One of the problems with the document is that this wealth of information obscures the compliance requirements for writers and readers. TIFF revision 6.0--although imminent--is not yet standardized and therefore is not addressed by this note.
We describe the Image Understanding Environment (IUE) designed by the IUE committee consisting of members from General Electric, Stanford University, Columbia University, University of Massachusetts, Amerinex AI Inc, Georgia Tech, SRI International, Advanced Decision Systems, University of Southern California and University of Washington. The primary purpose of the IUE is to facilitate exchange of research results within the Image Understanding community. The IUE will serve as a conceptual standard for IU data models and algorithms and will facilitate code sharing and performance evaluation of new techniques. It will also help in tracking progress in algorithm improvements. Object-oriented principles are used in our approach to the design of the IUE. The overall specification of IUE objects consists of the specifications of classes and class hierarchies for various IU concepts such as: images, image features, geometric features, curves, surfaces, 3D objects, sensors, etc. This paper discusses the design details of IUE curve objects, the motivation behind the object choices, and the class hierarchies.
The design and implementation of imaging algorithms is a growing burden to both application developers and suppliers of computer systems and specialized processors. While programmer's interfaces are being standardized to address some of these problems, these systems still necessitate extensive development efforts. Traditional languages do not provide productive environments for approaching such efforts, lacking support of constructs found in common imaging expressions. An expression language was designed around operators and semantics typically found in imaging algorithms. The language uses a notation that closely models classical discrete math. This notation can be compiled into executable code and also allows specific optimizations for specialized hardware. In the expression language, algorithms can be implemented in a form close to that found in imaging texts--taking advantage of the elegance of this shorthand notation which often requires many additional statements in a conventional programming language. The current implementation of the expression language generates both C source code and LATEX. The LATEX code provides an unambiguous typeset expression identical to the original mathematical description. For this reason, the ANSI X3H3.8 committee has voted to use this notation as a specification for 'man pages' in the PIK standard. Additionally, an interactive programming environment is being developed for the expression language, demonstrating its utility as an end user tool.
The design of portable image processing algorithms depends on the availability of standard specification languages. In many cases, such specification languages have taken the form of subprogram libraries. In this paper, we discuss a different approach to language standards, namely, the use of a mathematical system, an image algebra, for specifying image processing algorithms. The AFATL image algebra, capable of specifying all finite gray-level image processing algorithms, provides a variety of mathematical tools with which to manipulate images at a high level. This paper discusses the unique benefits of using such a mathematical system as a common interface specification rather than using typical subprogram libraries, and presents the basic operations and operands of the AFATL image algebra. In addition, we look closely at an imbedding of the image algebra into the Ada programming language. This imbedding provides the basis for a portable high-level image processing language. Benefits and drawbacks of both an Image Algebra Ada (IAA) translator and an Image Algebra Interpreter (IAI) for a sublanguage of IAA are discussed. We close with an analysis of prospects for future use of image algebra in algorithm specification.
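The pointwise style of such an algebra can be illustrated with a toy sketch; the operator names and the image representation below are illustrative only, not the actual operands or definitions of the AFATL image algebra.

```python
# Toy pointwise image algebra: an image is a dict from points to gray values.
# A binary operation is lifted to images, with scalars treated as constant
# images; names here are illustrative, not AFATL definitions.
def pointwise(op, a, b):
    if isinstance(b, dict):               # image (op) image, same point set
        return {p: op(a[p], b[p]) for p in a}
    return {p: op(a[p], b) for p in a}    # image (op) scalar

add = lambda a, b: pointwise(lambda x, y: x + y, a, b)
mul = lambda a, b: pointwise(lambda x, y: x * y, a, b)
maximum = lambda a, b: pointwise(max, a, b)
```

Expressions like `maximum(add(a, b), 0)` then read much like the algebraic notation they stand in for.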
A common image processing hardware configuration consists of fast, special-purpose hardware attached to a general-purpose computer. The special-purpose hardware performs the computationally-intensive processing. This works well for algorithms that have been hand-coded for the special hardware, and in situations where a complete compiler for the attached processor is available. However, the development of new algorithms requires composition of the basic, hand-coded operators and this suffers from problems of inefficiency in both memory usage and loop overhead. Chaining mechanisms allow the delayed execution of operations, with the potential for optimizing combinations of operations to reduce this inefficiency. Modern programming languages, such as C++, allow attractive implementations of chaining as the programming interface can be natural and intuitive with little syntactical overhead for the chaining constructs. The paper will discuss the philosophy and implementation of chaining and a method for building optimized, chained image processing constructs without a special-purpose language parser or compiler.
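The chaining idea can be sketched compactly. The paper's setting is C++ operator overloading; the Python sketch below shows the same structure under assumed names: building an expression merely records a chain, and evaluation runs one fused per-pixel loop with no intermediate images.

```python
# Deferred image expressions: a + b * c builds a chain of BinOp nodes,
# and nothing is computed until evaluate() runs one fused pass.
class Expr:
    def __add__(self, other): return BinOp(lambda x, y: x + y, self, other)
    def __mul__(self, other): return BinOp(lambda x, y: x * y, self, other)

class Image(Expr):
    def __init__(self, pixels): self.pixels = pixels
    def at(self, i): return self.pixels[i]

class BinOp(Expr):
    def __init__(self, op, lhs, rhs): self.op, self.lhs, self.rhs = op, lhs, rhs
    def at(self, i): return self.op(self.lhs.at(i), self.rhs.at(i))

def evaluate(expr, n):
    # one loop over the pixels; no intermediate images are materialized
    return [expr.at(i) for i in range(n)]
```

A composition like `a + b * c` would, with eager operators, allocate a temporary image for `b * c` and loop twice; here it loops once and allocates only the result.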
A uniform interface for the data exchange between image segmentation and high-level image analysis is presented, termed here an 'iconic-symbolic interface'. The interface is specified as a class in an object-oriented programming environment. The term 'iconic processing' is contrasted to 'iconic data structures.' Symbolic processing is separated from iconic processing by the use of explicitly represented knowledge about the task domain. Many segmentation algorithms may be performed independent of the task domain. It is shown that the same holds for the recovery of depth and surface information by shape from shading or stereo and for the detection of motion. Several data structures for the representation of the results of segmentation are compared. The new class 'segmentation object' (i.e., the data structure and the required operations on it) is defined as a superset of the other proposed data structures. It allows for a uniform representation for 2-D and 3-D image segmentation and for motion detection. The interface to symbolic processing is defined by a machine-independent external representation of the segmentation object. Compactness is obtained by binary storage. International standardization of low-level image preprocessing and of an image interchange format is in process. A future standard can cooperate with the external representation of segmentation objects.
Functional programming is a style of programming that avoids the use of side effects (like assignment) and uses functions as first-class data objects. Compared with imperative programs, functional programs can be parallelized better, and provide better encapsulation, type checking, and abstractions. This is important for building and integrating large vision software systems. In the past, efficiency has been an obstacle to the application of functional programming techniques in computationally intensive areas such as computer vision. We discuss and evaluate several 'functional' data structures for efficiently representing data structures and objects common in computer vision. In particular, we will address: automatic storage allocation and reclamation issues; abstraction of control structures; efficient sequential update of large data structures; representing images as functions; and object-oriented programming. Our experience suggests that functional techniques are feasible for high-performance vision systems, and that a functional approach greatly simplifies the implementation and integration of vision systems. Examples in C++ and SML are given.
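The 'representing images as functions' idea mentioned above can be sketched directly: an image is a function from coordinates to values, and operations return new functions instead of mutating pixel arrays. The helper names below are illustrative, not from the paper.

```python
# An image as a function (x, y) -> value; all operations are side-effect free.
def constant(v):
    return lambda x, y: v

def from_array(a):
    return lambda x, y: a[y][x]

def lift(op, *imgs):
    # pointwise lifting of an n-ary operation to image functions
    return lambda x, y: op(*(im(x, y) for im in imgs))

def shift(im, dx, dy):
    return lambda x, y: im(x - dx, y - dy)

def render(im, w, h):
    # sample the function only when concrete pixels are finally needed
    return [[im(x, y) for x in range(w)] for y in range(h)]
```

Until `render` is called, composing `lift` and `shift` costs nothing per pixel, which is one reason such representations can be efficient despite their abstraction.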
Adapt is a data parallel little language for both local and global image processing on parallel computers. It is architecture independent: it hides the distribution of data, the number of processors and their topology, and even the existence of multiple processes from the programmer. The programs Adapt generates are efficient (even compared with hand-written code), easy to compile for MIMD architectures, and easy to write. Adapt presents the programmer with three underlying concepts: the idea of the split and merge programming model, raster order per-pixel processing, and the scanline/transpose method. These three concepts make it possible to implement a wide variety of image processing algorithms, including histogram, uniform convolution, run-length encoding, image warping, connected components analysis, and two-dimensional fast Fourier transform. Performance of Adapt on Sun/Unix workstations, the Carnegie Mellon Warp machine, and the Carnegie Mellon - Intel Corporation iWarp computer will be presented. Adapt is being used in an implementation of the emerging ISO/ANSI standard Programmer's Imaging Kernel System. The implementation strategy of the library will be discussed.
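The split-and-merge model can be illustrated with one of the algorithms listed, the histogram: each simulated processor histograms its own subset of rows, and the partial histograms merge by summation. The row-cyclic split and processor count below are illustrative assumptions, not Adapt's actual data distribution.

```python
# Per-"processor" work: histogram one strip of rows.
def local_histogram(strip, levels=256):
    h = [0] * levels
    for row in strip:
        for v in row:
            h[v] += 1
    return h

# Split-and-merge: split the image into strips, compute local histograms
# independently, then merge the partial results by summation.
def split_merge_histogram(image, nprocs=4, levels=256):
    strips = [image[i::nprocs] for i in range(nprocs)]  # row-cyclic split
    partials = [local_histogram(s, levels) for s in strips]
    return [sum(p[g] for p in partials) for g in range(levels)]
```

The merge step is associative, which is what lets a compiler like Adapt schedule the local phases on any number of processors without changing the result.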
VPL 1.0 is a visual programming language for image processing. It uses a lazy functional programming paradigm, expressed with a box-and-arc representation. In the current version of the system, the image processing functionality is provided by VIEW-Station, an image processing library developed in one of Canon's Japanese laboratories. Some of the notable features of the system are: the program is always 'live'; higher-order functions are allowed; and the visual language interface and evaluation modules are designed to connect easily to (most) C++ or ANSI-C image processing libraries. This paper discusses the advantages and disadvantages of these design decisions. It also discusses some of the resulting implementation issues, and the solutions adopted. Specific topics covered include: the use of higher-order functions in image processing; what type-checking would be desirable for image processing in a visual language environment; what type-checking is feasible when the visual language environment is used as a front-end to a C++ library; and the advantages and disadvantages of having the image processing sub-system fully integrated.
We present a three-part software environment tailored to the areas of computer vision and image processing (CVIP). The environment is designed to provide high performance and ease of use for CVIP researchers implementing algorithms and tasks on parallel systems. Cloner is a software reuse tool that helps a user design parallel algorithms by building on and modifying algorithms from the system library. It takes advantage of the fact that CVIP algorithms are often highly structured and that many algorithms have the same or similar structure. It is being designed as a menu-based, query-based system aimed at reducing the degree to which the user must be concerned with the details of parallel programming. Graph Matcher is a software tool to perform algorithm-to-architecture mapping for image processing algorithms. It consists of a library of known data-dependency structures and of mappings of these structures onto parallel architectures. For the regular graphs that characterize most image processing algorithms, the graph isomorphism used to identify a new algorithm graph as an instance of a library graph is performed in polynomial time. DISC (Dynamic Intelligent Scheduling and Control) is an operating system component that provides a rapid prototyping capability for execution of complex CVIP tasks on partitionable parallel systems. The scheduler addresses the problems of algorithms with execution times that depend on the image data and processing scenarios that vary dynamically based on the input image.
An image processing software architecture developed and implemented for a medium-grain parallel machine is described. A machine with medium-grain parallelism is one that contains multiple processors connected by a high-bandwidth data path. Issues addressed were how to split up the image for processing and with what granularity, how to assign tasks to an arbitrary number of processors, how to coordinate the execution of the processors, and how to perform processing which requires global information about the image (for example, an image histogram). Details of the architecture, design trade-offs, and issues encountered during implementation are presented.
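Two of the issues listed above, assigning work to an arbitrary number of processors and computing a result that needs global information, can be sketched as follows. This is a sequential simulation under assumed names; the global mean stands in for the abstract's example of an image histogram.

```python
# Balanced strip assignment for an arbitrary processor count P: processor p
# gets rows [lo, hi), with any remainder rows spread over the first few.
def strip_bounds(nrows, P):
    base, rem = divmod(nrows, P)
    bounds, lo = [], 0
    for p in range(P):
        hi = lo + base + (1 if p < rem else 0)
        bounds.append((lo, hi))
        lo = hi
    return bounds

# A global statistic via per-strip partial results plus one merge step,
# the same pattern needed for a global image histogram.
def global_mean(image, P):
    partials = []
    for lo, hi in strip_bounds(len(image), P):
        s = sum(sum(row) for row in image[lo:hi])
        n = sum(len(row) for row in image[lo:hi])
        partials.append((s, n))
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count
```

Only the small `(sum, count)` pairs cross the high-bandwidth data path, not the image itself, which is the point of the partial-then-merge organization.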
apART reflects the structure of an open, distributed environment. According to the general trend in the area of imaging, network-capable, general-purpose workstations with capabilities of open-system image communication and image input are used. Several heterogeneous components like CCD cameras, slide scanners, and image archives can be accessed. The system is driven by an object-oriented user interface where devices (image sources and destinations), operators (derived from a commercial image processing library), and images (of different data types) are managed and presented uniformly to the user. Browsing mechanisms are used to traverse devices, operators, and images. An audit trail mechanism is offered to record interactive operations on low-resolution image derivatives. These operations are processed off-line on the original image. Thus, the processing of extremely high-resolution raster images is possible, and the performance of resolution-dependent operations is enhanced significantly during interaction. An object-oriented database system (APRIL), which can be browsed, is integrated into the system. Attribute retrieval is supported by the user interface. Other essential features of the system include: implementation on top of the X Window System (X11R4) and the OSF/Motif widget set; a SUN4 general-purpose workstation, including Ethernet, magneto-optical disc, etc., as the hardware platform for the user interface; complete graphical-interactive parametrization of all operators; support of different image interchange formats (GIF, TIFF, IIF, etc.); consideration of current IPI standard activities within ISO/IEC for further refinement and extensions.
The main goal of the Khoros software project is to create and provide an integrated software development environment for information processing and data visualization. The Khoros software system is now being used as a foundation to improve productivity and promote software reuse in a wide variety of application domains. A powerful feature of the Khoros system is the high-level, abstract visual language that can be employed to significantly boost the productivity of the researcher. Central to the Khoros system is the need for a consistent yet flexible user interface development system that provides cohesiveness to the vast number of programs that make up the Khoros system. Automated tools assist in maintenance as well as development of programs. The software structure that embodies this system provides for extensibility and portability, and allows for easy tailoring to target specific application domains and processing environments. First, an overview of the Khoros software environment is given. Then this paper presents the abstract application programmer's interface (API), the data services that are provided in Khoros to support it, and the Khoros visualization and image file format. The authors contend that Khoros is an excellent environment for the exploration and implementation of imaging standards.
VIEW-Station is a workstation-based image processing system which merges the state-of-the-art software environment of Unix with the computing power of a fast image processor. VIEW-Station has a hierarchical software architecture, which facilitates device independence when porting across various hardware configurations, and provides extensibility in the development of application systems. The core image computing language is V-Sugar. V-Sugar provides a set of image-processing datatypes and allows image processing algorithms to be simply expressed, using a functional notation. VIEW-Station provides a hardware-independent window system extension called VIEW-Windows. In terms of GUI (graphical user interface), VIEW-Station has two notable aspects. One is to provide various types of GUI as visual environments for image processing execution. Three types of interpreters, called µV-Sugar, VS-Shell, and VPL, are provided. Users may choose whichever they prefer based on their experience and tasks. The other notable aspect is to provide facilities to create GUIs for new applications on the VIEW-Station system. A set of widgets is available for construction of task-oriented GUIs. A GUI builder called VIEW-Kid has been developed for WYSIWYG interactive interface design.
A software package to support advanced research in the field of image processing and image analysis must have sufficiently powerful features, especially if it is to be used in a variety of application domains. In this contribution, we present the concepts of the LUCI image processing package, which has been developed in order to meet requirements that have been formulated as a result of previous experiences in our research group. These include support for multi-dimensional images and programming constructs to perform operations on those pixels that are contained in areas of arbitrary shape. In addition, we briefly present support tools for managing this software and its documentation.
LaboImage provides scientists with general-purpose as well as specific processing families and tools in a highly interactive environment. The current software results from an evolution reflecting several years of development and experience. This paper first presents the new X Window / OSF Motif version of LaboImage, as it is seen by the user. It also describes how an image is manipulated in the system, how processing methods are applied, and how results are visualized. Multiple types of interaction between the user and the system are addressed. The implementation aspects are then detailed. They concern data structures as well as algorithms and interfaces. The data file and descriptor file formats used for storing images are described. The organization in memory of multiple data such as images, vectors, and macros is presented. The source code organization is also discussed. A clear separation between algorithmic and interface parts in the code appears to be very important, in order to allow easy further development of the system.
An important application of digital image processing is the compression of video sequences by one or two orders of magnitude with minor picture quality degradation. To achieve this data compression, elaborate algorithms are used. They eliminate both spatial and temporal redundancy by using transform, differential, and variable-length coding techniques. Two of these algorithms are the CCITT H.261 algorithm for videotelephony and the ISO MPEG algorithm for CD-ROM motion video. The hardware implementation of these algorithms is a formidable task in view of the number of operations (more than 1 GFLOPS) that may be necessary. This paper discusses the compression and decompression of real-time video using a multiprocessor system based on digital signal processors. The system is based on partitioning each picture into horizontal strips, each handled by a local processor unit combining the TMS320C30 signal processor and an A121 discrete cosine transform processor. In the encoder, each strip processor inputs raw data from a video acquisition module through a common parallel video bus and outputs compressed data to a supervisor module through a common serial supervisor bus. In the decoder, the data flows through an inverse path, i.e., the processors receive data from a supervisor module and transmit data to a display module. All operations within the horizontal strips are independent of each other except when motion estimation is used. In this case, the processing elements have to access regions of the picture that are allocated to neighboring processors. The number of processors is related to the frame rate and the resolution of the image.
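The neighbor-access requirement of motion estimation can be made concrete: a strip processor coding rows [lo, hi) with a vertical search range of +/-R needs reference pixels from rows [lo-R, hi+R), clipped to the picture, and the extra rows must be fetched from neighboring processors. A sketch under assumed parameter names:

```python
# Extended row range a strip processor must hold: its own rows [lo, hi)
# plus up to R rows above and below for the motion search, clipped to
# the picture height.
def strip_with_halo(lo, hi, search_range, height):
    return max(0, lo - search_range), min(height, hi + search_range)

# Rows owned by neighboring processors that this strip must fetch.
def halo_rows(lo, hi, search_range, height):
    elo, ehi = strip_with_halo(lo, hi, search_range, height)
    return (lo - elo) + (ehi - hi)
```

Interior strips fetch rows from both neighbors, while the top and bottom strips fetch from only one, which is why only motion estimation breaks the otherwise independent strip processing.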
Recent practice in image processing is dominated by heuristic methods used to design practical, relevant algorithms. To ensure high efficiency in the design process, the communication between user and computer should be as direct as possible. An interactive software system for image processing is required to fulfill this demand. Interpreter-based systems with high interactivity available on the software market have the drawback of low operation speed. In AMBA/D we combine the performance of a compiler-based system with the interactivity of an interpreter system. The AMBA/D system is an interactive programming environment with integrated facilities to create, compile, execute, and debug programs. AMBA/D combines a compiler language and a direct-execution programming concept with a collection of high-level image processing procedures. The design of a special compiler language was necessary because existing computer languages like FORTRAN, C, etc., do not fulfill our requirement of interactivity. The system runs on an IBM-compatible personal computer and can be used with different types of commercially available frame grabbers.
Image compression is used to handle the large volume of digitized image data in order to minimize the time and cost required to store and transfer the digitized data. Image compression is one of the key components in emerging applications such as digital still video cameras, multimedia, color printers, video fax machines, and desktop publishing. This paper will describe the Zoran 031 image compression chip set. The chip set comprises the ZR36020 Discrete Cosine Transform (DCT) Processor and the ZR36031 Image Compression Coder/Decoder that work together to perform image compression and expansion. The chip set employs an algorithm for high-quality compression of continuous-tone color or monochrome images, similar to the algorithm specified in the Joint Photographic Experts Group (JPEG) standard. The 031 chip set is targeted at cost-sensitive business and consumer applications such as digital still video cameras, color printers, color fax machines, and scanners. The architecture and the coding/decoding algorithm of the chip set as well as the add-in image compression PC board in which it is utilized will be discussed.
The application of computer technology to the automation of visual tasks is a difficult and time-consuming process. Until recently, the lack of powerful software tools has required that the developer of computer-based imaging applications be capable both of programming computer systems and of conducting imaging research. The characteristics of the development process that should be captured in such an environment are related to human-machine interface issues. This paper reports on the user interface issues that were encountered during the implementation and ongoing development of a commercial product, designed to aid in the construction of image understanding applications.