Prolog offers a very different style of programming from conventional languages; it can define object properties and abstract relationships in a way that Java, C, C++, etc. find awkward. In an accompanying paper, the authors describe how a distributed web-based vision system can be built using elements that may even be located on different continents. One particular system of this general type is described here. The top-level controller is a Prolog program, which operates one or more image processing engines. This role is natural for Prolog, since it is able to reason logically using symbolic (non-numeric) data. Although Prolog is not suitable for programming image processing functions directly, it is ideal for analysing the results derived by an image processor. This article describes the implementation of two systems in which a Prolog program controls several image processing engines, a simple robot, a pneumatic pick-and-place arm, LED illumination modules and various mains-powered devices.
In a typical machine vision algorithm, a few complex operators may account for a significant fraction of the overall execution time. In addition to these, there are usually many simple 3x3 operators which are used over and over. Although individually fairly fast, these operators sometimes dominate the overall execution time because they are used so many times. And because they are fairly fast, little attention is paid to speeding them up. The many previous papers describing the SKIPSM paradigm have concentrated mainly on large-neighborhood operations, because speed improvements are the most dramatic in such cases. In this paper, SKIPSM implementations of some common 3x3 linearly-separable and nonseparable operators are considered. Examples include low-pass (i.e., blurring and smoothing) filters, band-pass filters, high-pass (i.e., edge detector) filters, and gradient operators. Speed comparisons between conventional implementations and SKIPSM implementations are presented.
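The linear separability that makes these 3x3 operators candidates for speedup can be sketched as follows. This is an illustrative Python/NumPy fragment, not the paper's SKIPSM C implementation; the function name and zero-padded border handling are assumptions:

```python
import numpy as np

def conv3x3_separable(img, row_k, col_k):
    """Apply a linearly-separable 3x3 kernel (the outer product of col_k
    and row_k) as a horizontal 1x3 pass followed by a vertical 3x1 pass,
    with zero-padded borders."""
    p = np.pad(img.astype(float), 1)
    # Row pass: 1x3 weighted sum along each row
    tmp = row_k[0] * p[:, :-2] + row_k[1] * p[:, 1:-1] + row_k[2] * p[:, 2:]
    # Column pass: 3x1 weighted sum along each column
    return col_k[0] * tmp[:-2, :] + col_k[1] * tmp[1:-1, :] + col_k[2] * tmp[2:, :]
```

The two-pass result is identical to the full 3x3 neighborhood sum but touches each pixel six times instead of nine; the SKIPSM formulation goes further by folding each pass into a finite-state machine.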
Characterizing the surface texture of objects or moving webs is sometimes important in identifying and/or determining the quality of products. Fourier analysis is often used for this in laboratory situations, but inspection algorithms using the FFT are too slow for many production situations, even when implemented on fast computers. An alternative technique pioneered in the 1970s by Haralick operates in the spatial domain and uses grey-level cooccurrence matrices (GLCMs) as a first step toward obtaining useful measures characterizing textures. Although the GLCM approach is much less computationally intensive than the FFT, it nonetheless requires massive amounts of calculation. Most of this computation time is spent in stepping through the input image and compiling the matrices themselves. Therefore, if the calculation time for these matrices could be reduced, the GLCM technique would become more practical. This paper applies the SKIPSM paradigm to the calculation of GLCMs, and provides execution times for this implementation.
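As a hedged illustration of the expensive first step, a straightforward GLCM accumulation for one displacement vector might look like this in Python (the function name and conventions are assumptions; the paper's SKIPSM implementation is organized quite differently and is much faster):

```python
import numpy as np

def glcm(img, dx, dy, levels):
    """Count co-occurrences of grey-level pairs separated by (dx, dy).
    Stepping through the whole image like this, once per displacement,
    is what dominates conventional GLCM run time."""
    m = np.zeros((levels, levels), dtype=np.int64)
    h, w = img.shape
    for y in range(max(0, -dy), min(h, h - dy)):
        for x in range(max(0, -dx), min(w, w - dx)):
            m[img[y, x], img[y + dy, x + dx]] += 1
    return m
```

Texture measures such as contrast or energy are then computed from the normalized matrix, which is cheap by comparison.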
Hitherto, it has been assumed that an industrial Machine Vision system is constructed as an integrated unit, with the camera, image processing unit and control/display console being located close to one another and to the object/scene being inspected. For several reasons, it may be helpful to separate them, so that only the camera and its associated lighting units are located on the factory floor, while other equipment, such as computers and user terminals, is located elsewhere, out of harm's way. We describe three systems that allow multiple cameras and several separate image processing engines to be controlled remotely from a single "intelligent" device, or by a user working via a standard web browser. The paper describes and compares several different approaches to building such a system. Links to the networked vision systems mentioned here are provided in an accompanying web site.
As the cost/performance ratio of vision systems improves with time, new classes of applications become feasible. One such area, automotive applications, is currently being investigated. Applications include occupant detection, collision avoidance and lane tracking. Interest in occupant detection has been spurred by federal automotive safety rules in response to injuries and fatalities caused by deployment of passenger-side air bags. In principle, a vision system could control airbag deployment to prevent this type of mishap. Employing vision technology here, however, presents a variety of challenges, which include controlling costs, the inability to control illumination, developing and training a reliable classification system, and loss of performance caused by production variations arising from manufacturing tolerances and customer options. This paper describes the measures that have been developed to evaluate the sensitivity of an occupant detection system to these types of variations. Two procedures are described for evaluating how sensitive the classifier is to camera variations: the first is based on classification accuracy, while the second evaluates feature differences.
In this paper the authors present two methods to extract handwritten components from a personal bank check. The first technique uses a blank check image designated as the reference image and a filled-in check image designated as the sample image. These checks differ only in the specific regions of the check reserved for insertion of handwritten or machine-printed components. A technique called morphological subtraction is used to extract the user-entered components. However, because the background patterns of the reference and sample images will not be identical, an affine transform is used to ensure better alignment between the two images. To eliminate the need for a blank check of every kind, a second method is proposed that uses grayscale morphology to extract the handwritten information. Because some background features are similar to the user-entered information, some residual background information is likely to be present in the processed image; however, this residue can be removed with post-processing.
Optical character recognition of machine-printed documents is an effective means for extracting textual material. While the level of effectiveness for handwritten documents is much poorer, progress is being made in more constrained applications such as personal checks and postal addresses. In these applications a series of steps is performed for recognition, beginning with removal of skew and slant. Slant, the amount by which characters are tilted from the vertical, is a characteristic unique to the writer and varies from writer to writer. Skew, the second attribute, arises from the inability of the writer to write along a horizontal line. Several methods for average slant estimation and correction have been proposed and discussed in earlier papers. However, analysis of many handwritten documents reveals that slant is a local property and varies even within a word. The use of an average slant for the entire word often results in overestimation or underestimation of the local slant. This paper describes three methods for local slant estimation, namely the simple iterative method, the high-speed iterative method, and the 8-directional chain code method. The experimental results show that the proposed methods can estimate and correct local slant more effectively than average slant correction.
With the cost of image acquisition and processing hardware decreasing substantially, consumer applications utilizing machine vision are becoming more feasible. Automotive vision systems represent one emerging application area and offer the potential of significant enhancements to automotive safety. However, the relative lack of low-cost, high-performance cameras limits the use of vision technology in cars. Camera acquisition speed, sensitivity and dynamic range are especially critical because illumination is totally unconstrained in this type of application. A successful vision system must be highly reliable in conditions ranging from direct sunlight to near-total darkness. Conditions of extreme contrast occur primarily during the day, when deep shadows are cast across part of a scene being imaged by the camera. This paper provides a survey of existing camera hardware and discusses its limitations. Performance criteria for different automotive applications are also presented.
The SKIPSM (Separated-Kernel Image Processing using Finite-State Machines) paradigm was originally developed about 1990 as a technique for increasing the speed and versatility of pipelined image-processing hardware. As general-purpose computers became faster (although still much slower than dedicated hardware), it became clear that the most important application of SKIPSM would be for speeding up software image-processing programs running on PCs. This paper therefore concentrates on software implementations written in the C language. Because the SKIPSM paradigm is radically different from conventional image processing algorithms, the paper begins with a
general overview of the method. This is followed by some examples selected from the wide range of available SKIPSM operations. Finally, execution-speed comparisons for some of these examples are presented.
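To give the flavor of the method, here is a minimal one-pass row machine for binary erosion with a 1x3 structuring element, written as illustrative Python rather than the paper's C; the run-length state encoding shown is one simple choice, and the function name is an assumption:

```python
def erode_row_1x3(row):
    """One-pass row machine for binary erosion with a 1x3 flat SE.
    The FSM state is the current run length of 1s, capped at 3; the
    machine outputs 1 exactly when the last three pixels were all 1.
    The raw output is then shifted so each result is centered on the
    middle SE element (zero padding at the borders)."""
    state = 0
    out = []
    for p in row:
        state = min(state + 1, 3) if p else 0   # update state from new pixel
        out.append(1 if state == 3 else 0)       # emit output for this state
    # Result computed at the trailing pixel belongs to the middle pixel.
    return out[1:] + [0]
```

All neighborhood information lives in the single integer `state`, so no neighborhood re-fetching and no intermediate image are needed; this is the property SKIPSM exploits for much larger operators.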
This paper develops and extends the SKIPSM (Separated-Kernel Image Processing using Finite-State Machines) paradigm to provide fast and efficient grey-scale template matching on ordinary desktop computers. An earlier paper, published in 1994, applied the SKIPSM theory to a limited version of grey-scale template matching, but the specific applications used LUTs (lookup tables) and pipelined hardware, as did all SKIPSM papers of that era. In this paper, direct software implementations of the finite-state machines are used rather than LUTs, because computers with pipelined instruction streams and vector data structures lose most of their speed advantages when using LUTs.
The grassfire transform (GT) maps a binary image into a grey-level image in such a way that the output grey level of each interior pixel of each individual blob is proportional to the distance of that pixel from the blob boundary. While potentially very useful, the GT has seen limited application because of the many computational steps required to calculate it, resulting in long execution times. An earlier paper, published in 1994, presented a SKIPSM implementation of the GT in which six stages of burning were carried out in a single pass. That implementation used LUTs (lookup tables) and pipelined hardware, as with all SKIPSM papers of that era. In this paper, direct software implementations of the finite-state machines are used, rather than LUTs, because computers with pipelined instruction streams and vector data structures lose most of their speed advantages when using LUTs.
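For reference, the classical two-pass chamfer formulation of the grassfire/distance transform can be sketched as follows; this is an illustrative Python version of the conventional computation (names and the city-block metric are assumptions), not the paper's single-pass FSM implementation:

```python
import numpy as np

def grassfire(blob):
    """Two-pass city-block distance transform of a binary image: each
    foreground pixel receives its distance to the nearest background
    pixel (its 'burning' depth). The image border counts as background."""
    h, w = blob.shape
    INF = h + w
    d = np.where(blob > 0, INF, 0).astype(int)
    for y in range(h):              # forward pass: propagate from top/left
        for x in range(w):
            if d[y, x]:
                up = d[y - 1, x] if y else 0
                left = d[y, x - 1] if x else 0
                d[y, x] = min(d[y, x], up + 1, left + 1)
    for y in range(h - 1, -1, -1):  # backward pass: from bottom/right
        for x in range(w - 1, -1, -1):
            if d[y, x]:
                down = d[y + 1, x] if y < h - 1 else 0
                right = d[y, x + 1] if x < w - 1 else 0
                d[y, x] = min(d[y, x], down + 1, right + 1)
    return d
```

Even this efficient conventional form makes two full passes with per-pixel minimum tests; the SKIPSM version carries several stages of burning in FSM state within a single pass.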
This paper presents an updated version of a general method for carrying out binary template matching, which is useful for image analysis in general and automated visual inspection and quality control in particular. In a series of 23 papers, image processing implementations based on the SKIPSM (Separated-Kernel Image Processing using Finite-State Machines) paradigm have been shown to be faster, often much faster, than conventional implementations. One of the earliest of these papers, published in 1994, was devoted to binary template matching of various types. As with all the papers of that era, the theory was presented in general form but the specific applications used LUTs (lookup tables) and pipelined hardware. The results were impressive: templates of 35x35 or even larger could be executed in the same time that the identical hardware, programmed conventionally, needed to execute a 3x3 template. This paper develops and extends the same basic approach to provide fast and highly efficient binary template matching on ordinary desktop computers. This implementation does not use LUTs, because computers with pipelined instruction streams and vector data structures perform relatively slowly when using LUTs.
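The conventional baseline being compared against can be sketched as an exhaustive match-count scan (illustrative Python; function name assumed). Each template placement revisits every neighborhood pixel, which is exactly the work the SKIPSM formulation avoids:

```python
import numpy as np

def binary_template_match(img, tmpl):
    """Exhaustive binary template matching: for each placement of the
    template inside the image, count the pixels where image and template
    agree. Returns the map of match counts; a perfect hit scores
    tmpl.size."""
    H, W = img.shape
    h, w = tmpl.shape
    out = np.zeros((H - h + 1, W - w + 1), dtype=int)
    for y in range(H - h + 1):
        for x in range(W - w + 1):
            out[y, x] = np.sum(img[y:y + h, x:x + w] == tmpl)
    return out
```

For an h x w template this costs h*w operations per output pixel; the FSM approach reduces that to a fixed small number of state updates per pixel regardless of template size.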
Recognition of license plate images is the topic of this paper. The major issue is the accurate extraction of the license plate character string from varying backgrounds using processing techniques that are reasonably fast. The images are also characterized by non-uniform illumination. Recognition of the string is relatively straightforward if the extraction process has been correctly designed. The authors present three different approaches to extraction; their study revealed that the combination of gray-scale morphology with a log gray-scale transform provided accurate extraction of the string. Recognition studies with 700 images captured during day and night periods indicated an overall acceptance rate of nearly 90% with 0.7% confusion.
The SKIPSM paradigm gives very fast execution of binary morphology operations with large arbitrary SEs (structuring elements). Hardware-based applications using lookup tables to implement the FSMs have been in use for almost a decade. More recently, software-based applications have benefited from comparable speed increases. This paper provides speed comparisons between software implementations using lookup tables and those using direct implementations of the FSMs, for a range of SE sizes and shapes.
A common machine vision task is to verify that a product has been properly fabricated or assembled. In this paper, a vision system is described for confirming that a type of gear has been processed properly. The main problem in this application is the relatively large depth of the gear, which results in a more complex image than the desired silhouette of the part. The resulting image shows a portion of the inner wall because of varying magnification: some points on this wall are closer to the lens than others. Employing telecentric optics, however, greatly simplifies the problem, since only light rays parallel to the optical axis form the image, so that a good silhouette of the part is obtained. Gear teeth can then be isolated using mathematical morphology techniques to verify that the part has been properly broached.
The SKIPSM paradigm offers fast execution of a very wide range of binary, grey-scale, 3D, and color image-processing applications. In this paper the finite-state-machine approach is applied to one of the 'classical' problems of binary image processing: connected-component analysis. Execution-time results are presented, and compared for several examples to execution times for the very efficient conventional method based on analysis of run-length-encoded data.
The Crosshead Inspection System (CIS) utilizes machine vision technology for on-line inspection of a diesel engine component, the crosshead. The system includes three functional modules. 1) Part handling subsystem - presents parts for inspection and accepts or rejects them based on signals from the image analysis software. 2) Image acquisition hardware - optics, light sources and two video cameras collect images of inspected parts. 3) Image analysis software - analyzes the images and sends pass/fail decision signals to the handling subsystem. The CIS acquires and inspects two images of each part. The upper camera generates an image of the part's top surface, while the lower camera generates an image of the so-called 'pockets' of the lower half. Both images are acquired when a part-in-place signal is received from the handling system. The surface inspection camera and light source are positioned at opposed low angles relative to the surface. Irregularities manifest themselves as shadows in the surface image. These shadows are detected, measured and compared to user specifications. The pocket inspection detects the presence of tumbler stones. The contrast of these stones is enhanced with circularly polarized lighting and imaging. The graphical user interface of the CIS provides easy setup and debugging of the image processing algorithms. A database module collects, archives and presents part inspection statistics to the user. The inspection rate is sixty parts per minute.
The objective of the system is inspection of individual pieces of stemware for geometry defects and glass imperfections. Cameras view stemware from multiple angles to increase surface coverage. The inspection images are acquired at three stations. The first inspects internal glass quality, detecting defects such as chemical residue and waviness. The second inspects the rim, geometry of the stemware body and stem, and internal defects such as cracks. The third station inspects the stemware base for geometrical and internal defects. Glass defects are optically enhanced through the use of striped pattern back lighting combined with morphological processing. Geometry inspection is enhanced through the use of converging illumination at the second station, while the third station utilizes large field true telecentric imaging. Progressive scan cameras and frame grabbers capable of simultaneous image capture are used at each station. The system software comprises six modules: system manager, I/O manager, inspection module for each station, and stemware sorting and logging module. Each module is run as a separate application. Applications communicate with each other through TCP/IP sockets, and can be run in a multi-computer or single-computer setup. Currently two Windows NT workstations are used to host the system.
A vision system to gauge two types of automotive parts has been developed. One of the part types is a power steering connector, in which the depth and width of the groove and the distance between the start of the groove and the end of the power steering line are gauged. For the second type of part, a crimped connector attached to a brake hose, the measurements of interest are the two diameters of the crimp and the bell length where the hose is inserted into the connector. A standard video camera is used to acquire an image of the back-illuminated part, which is digitized and captured with a frame grabber. The basic hardware to accomplish the gauging tasks consists of a standard video camera, light source, frame grabber and industrial personal computer. In order to minimize hardware costs, a standard 50 mm C-mount camera lens and extension tube were used with the video camera. Consideration was given to using more expensive telecentric optics so that part placement would not cause a change in magnification with a resulting loss of accuracy. With the 50 mm lens, however, magnification effects were lessened due to the greater standoff distance between camera and part. For image acquisition, a low-cost PCI-bus frame grabber card was chosen. With this type of card, high-speed video capture is possible due to the very wide bandwidth of the PCI bus. Combined with a Pentium-based PC, rapid image acquisition and analysis can be performed so that every part can be gauged at full production rates. Since the gauging rate exceeds the production rate by a significant factor, a single computer and frame grabber with camera multiplexer can process data in real time from up to four measurement stations simultaneously.
This paper describes the benchmarking of image processing algorithms using high-performance workstations and personal desktop computers. For the various platforms evaluated, which included machines from Sun, SGI, Apple, and Gateway, compiler options were varied to obtain the fastest execution times. Algorithms evaluated included typical image processing operations such as derivatives, logical operations, morphology, subtraction, and median filtering, as well as the new SKIPSM approach. Data were collected using the different platforms and are presented here in tabular form. The results indicate that the latest generation of personal computers has processing capabilities similar to those of UNIX-based workstations.
For the past three summers, students at the University of Michigan-Dearborn have been participating in the development and testing of various aspects of machine vision systems with support from the National Science Foundation under the Research Experiences for Undergraduates (REU) program. Much of the work has involved algorithm development, since useful work can be performed with a fairly modest programming background. Benchmarking of various algorithms is a related activity that has seen much student participation. To a lesser extent, illumination and optics work has also been performed for the development of experimental setups and actual implementation of vision systems. Over the three-year duration of the program, a total of 34 students participated in these activities. While many of the participants were full-time students at the University of Michigan-Dearborn, others came from engineering colleges across a diverse geographical area. Summaries of a number of the projects are included here. It may be noted that the National Science Foundation established the REU program to encourage more students to obtain advanced degrees in science and engineering and ultimately to pursue careers in research and development.
2D Gaussian blur operations are used in many image processing applications. The execution times of these operations can be rather long, especially where large kernels are involved. Proper use of two properties of Gaussian blurs can help to reduce these long execution times: (1) Large kernels can be decomposed into the sequential application of small kernels. (2) Gaussian blurs are separable into row and column operations. This paper makes use of both of these characteristics and adds a third one: (3) The row and column operations can be formulated as finite-state machines to produce highly efficient code and, for multi-step decompositions, eliminate writing to intermediate images. This paper shows the FSM formulation of the Gaussian blur for the general case and provides examples. Speed comparisons between various implementations are provided for some of the examples. The emphasis is on software implementations, but implementations in pipelined hardware are also discussed. Straightforward extensions of these concepts to 3- and higher-dimensional image processing are also presented. Implementation techniques for DOG (Difference-of-Gaussian filters) are also provided.
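Properties (1) and (2) can be sketched together: repeated separable binomial [1,2,1]/4 passes approximate a Gaussian of growing width. This Python fragment illustrates only the decomposition; the paper's contribution, property (3), reformulates each pass as a finite-state machine, and the function name and edge-replicated borders here are assumptions:

```python
import numpy as np

def gauss_blur(img, passes=2):
    """Approximate a 2D Gaussian blur by repeated separable [1,2,1]/4
    binomial passes: each pass is a 1x3 row filter followed by a 3x1
    column filter (edge-replicated borders). More passes give a wider,
    more nearly Gaussian kernel."""
    out = img.astype(float)
    for _ in range(passes):
        p = np.pad(out, ((0, 0), (1, 1)), mode='edge')   # row pass
        out = (p[:, :-2] + 2 * p[:, 1:-1] + p[:, 2:]) / 4
        p = np.pad(out, ((1, 1), (0, 0)), mode='edge')   # column pass
        out = (p[:-2, :] + 2 * p[1:-1, :] + p[2:, :]) / 4
    return out
```

A single pass is exactly the 3x3 kernel [1,2,1]x[1,2,1]/16; a DoG filter can then be formed by differencing two blurs with different pass counts.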
The earlier papers on SKIPSM (separated-kernel image processing using finite state machines) concentrated mainly on implementations using pipelined hardware. Because of the potential for significant speed increases, the technique has even more to offer for software implementations. However, the gigantic structuring elements (e.g., 51 by 51 in one pass) readily available in binary morphology using SKIPSM are not practical in gray-level morphology. Nevertheless, useful structuring element sizes can be achieved. This paper describes two such applications: dilation with a 7 by 7 square and a 7 by 7 octagon. Previous 2-D SKIPSM implementations had one row machine and one column machine. Two of the implementations described here follow this pattern, but the other has four machines: row, column, and the two 45-degree diagonals. In operation, all of these are one-pass algorithms: The next pixel is 'fetched' from the input device, the two (or four) machines are updated in turn, and the resulting output pixel is written to the output device. All neighborhood information needed for processing is encoded in the state vectors of the finite-state machines. Therefore, no intermediate image stores are needed. Furthermore, even the input and output image stores can be eliminated if the image processor can keep up with the input pixel rate. Comparisons are provided between these finite-state-machine implementations and conventional implementation of the 2-step and 4-step decompositions, all based on the same structuring elements.
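The row-plus-column decomposition for the 7 by 7 square is easy to state: a flat grey-scale dilation separates into a 1x7 running maximum followed by a 7x1 running maximum. A hedged Python sketch follows (edge-replicated borders and shifted-copy maxima are assumptions of this illustration; the paper uses one-pass finite-state machines instead):

```python
import numpy as np

def running_max(a, size, axis):
    """Centered running maximum of width `size` along `axis`,
    with edge-replicated borders."""
    r = size // 2
    pad = [(0, 0)] * a.ndim
    pad[axis] = (r, r)
    p = np.pad(a, pad, mode='edge')
    n = a.shape[axis]
    views = [np.take(p, np.arange(k, k + n), axis=axis) for k in range(size)]
    return np.maximum.reduce(views)

def dilate7x7(img):
    """Flat 7x7 grey-scale dilation as a 1x7 row pass (one 'row machine')
    followed by a 7x1 column pass (one 'column machine')."""
    return running_max(running_max(img, 7, axis=1), 7, axis=0)
```

The octagonal SE adds the two 45-degree diagonal passes mentioned in the abstract, giving the four-machine variant.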
Low-cost PC-based machine vision systems have become more common due to faster processing capabilities and the availability of compatible high-speed image acquisition and processing hardware. One development, which is likely to have a very favorable impact on this trend, is enhanced multimedia capabilities present in new processor chips such as Intel MMX and Cyrix M2 processors. Special instructions are provided with this type of hardware which, combined with a SIMD parallel processing architecture, provides a substantial speed improvement over more traditional processors. Eight simultaneous byte or four double-byte operations are possible. The new instructions are similar to those provided by DSP chips such as multiply and accumulate and are quite useful for linear processing operations like convolution. However, only four pixels may be processed simultaneously because of the limited dynamic range of byte data. Given the inherent limitations with respect to looping in SIMD hardware, nonlinear operations such as erosion and dilation would seem to be difficult to implement. However, special instructions are available for required operations. Benchmarks for a number of image-processing operations are provided in the paper to illustrate the advantages of the new multimedia extensions for vision applications.
A system for 3D gauging of small fibers has been developed for process monitoring. The basic hardware consists of a pair of orthogonally positioned 2048-element linear cameras, an IBM PC-compatible Pentium computer with frame grabber, a stepper motor and associated hardware for translating the fiber, a bright-field light source and special optics. The fiber is moved vertically past the two cameras as they scan. The computer acquires each scan line, processes it and then issues control signals to the stepper motor. Several different image processing operations are used to minimize the effects of illumination nonuniformity, since fibers will sometimes have low contrast due to their small size. There are two sources of illumination variation, spatial and temporal, which are processed independently. Image analysis is performed to provide 3D fiber shape characteristics.
A system for measuring barriers in a color AC plasma panel has been developed. Barriers are used in this type of display to prevent phosphors in cells adjacent to lit cells from being excited which adversely affects color purity. The geometry of the barriers is a significant factor for successful operation of color plasma panels and must be measured to verify that the barriers are within specifications. Barrier height is on the order of several mils with a pitch on the order of about 10 mils. A system developed for spacer measurements was available for this application. However, it did not have sufficient light sensitivity because the barriers reflect light much less efficiently than traditional panels. The original system employed a light section microscope for height measurement. The video amplifier gain was boosted significantly in the frame grabber and frame integration was provided to reduce noise. Finally, background subtraction was provided to remove shading variations associated with the normally insignificant dark current of the CCD sensor. Once a good image had been obtained, morphological processing was performed to reduce noise and centroid calculations were performed to provide an accurate measure of the barrier surface height.
The stability of glass substrates is an important concern for the flat panel display industry. High-resolution displays have very tight geometrical requirements and alignment of the various display components is critical if good performance is to be obtained. Prior to development of manufacturing processes for these displays, it is necessary to determine how glass substrates change during the various processing steps. This paper describes a system to measure electrode patterns before and after critical processing steps for color plasma panels. The electrode patterns, which are made of thin-film gold, are a series of parallel electrodes. In order to measure electrode locations, a vision system consisting of an X-Y stage, a video camera, a frame grabber, and PC-compatible computer was used. Images captured with this setup were processed to minimize the effects of noise and improve accuracy. A gray-scale interpolation technique in which the centroids of the electrodes are calculated was used to enhance measurement resolution.
The GNU project has provided a substantial quantity of free high-quality software tools for UNIX-based machines, including the GNU C compiler, which is used on a wide variety of hardware systems including IBM PC-compatible machines with 80386 or newer (32-bit) processors. While this compiler was developed for UNIX applications, it has been successfully ported to DOS and offers substantial benefits over traditional DOS-based 16-bit compilers for machine vision applications. One of the most significant advantages of GNU C is the removal of the 640 KB limit, since addressing is performed with 32-bit pointers. Hence, all physical memory can be used directly to store and retrieve images, lookup tables, databases, etc. Execution speed is generally higher as well, since 32-bit code usually executes faster and there are no far pointers. Protected-mode operation provides other benefits: errant pointers often cause segmentation errors, and the source of such errors can be readily identified using special tools provided with the compiler. Examples of vision applications using GNU C include automatic handwritten address block recognition, counting of shattered-glass particles, and dimensional analysis.
Illumination-invariant image processing is an extension of the classical technique of homomorphic filtering using a logarithmic point transformation. In this paper, traditional approaches to illumination-invariant processing are briefly reviewed and then extended using newer image processing techniques. Relevant hardware considerations are also discussed including the number of bits per pixel required for digitization, minimizing the dynamic range of the data for image processing, and camera requirements. Three applications using illumination-invariant processing techniques are also provided.
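The core log-transform idea being reviewed can be stated in two lines: under the model image = reflectance x illumination, a logarithmic point transformation turns the unwanted multiplicative illumination term into an additive one that can simply be subtracted. A minimal Python sketch, assuming an illumination estimate is available (the function name and the epsilon guard are assumptions of this illustration):

```python
import numpy as np

def homomorphic_normalize(img, illum, eps=1e-6):
    """Divide out illumination in the log domain:
    log(r*i) - log(i) = log(r), leaving an illumination-invariant
    reflectance estimate. eps guards against log(0)."""
    return np.log(img + eps) - np.log(illum + eps)
```

In practice the illumination estimate is often obtained by heavy low-pass filtering of the image itself, since illumination typically varies slowly across the scene.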
This paper discusses very simple but effective one-dimensional morphological techniques for the identification of primary and secondary peak locations associated with reflected light patterns from glass surfaces. A common optical technique for measuring glass thickness and related properties is to observe light reflected from the glass surfaces. Two reflections can be observed when an appropriate structured light source is used to illuminate a glass surface. A very bright primary reflection associated with the reflection from the front surface will be observed along with a much fainter secondary reflection from the back surface. The secondary reflection is difficult to detect reliably given the large difference in magnitude between the two peaks, the presence of noise, and the varying amounts of overlap between the two peaks that can occur. The methods described in the paper have been implemented successfully for two vision applications using images acquired using standard matrix and linear cameras. The signal is preprocessed using one-dimensional morphological and linear methods to normalize the background and remove noise. Further morphological operations are performed to identify the peaks associated with primary and secondary reflections.
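The kind of one-dimensional morphology described can be illustrated with a grey-scale opening and top-hat, which isolates peaks narrower than the structuring element while flattening the background. This is an illustrative Python sketch (names, flat SE, and edge-replicated borders are assumptions; the paper's actual peak-identification sequence is more involved):

```python
import numpy as np

def tophat_1d(signal, size):
    """1D grey-scale top-hat with a flat SE of the given width:
    opening = dilation(erosion(signal)); top-hat = signal - opening.
    Peaks narrower than the SE survive; the background is removed."""
    n = len(signal)
    r = size // 2
    p = np.pad(signal, r, mode='edge')
    ero = np.minimum.reduce([p[k:k + n] for k in range(size)])   # erosion
    pe = np.pad(ero, r, mode='edge')
    opening = np.maximum.reduce([pe[k:k + n] for k in range(size)])
    return signal - opening
```

Applied after background normalization, thresholding the top-hat output locates both the bright primary peak and the much fainter secondary one.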
This paper describes the development of a cost-effective and reliable automated method for inspecting spacers on the dielectric surface of AC plasma display panels. The system generates 3D profiles of spacers using a light-section microscope in conjunction with a PC-based vision system. Structured lighting, a video camera and a frame grabber are used to capture images for computer analysis.
This paper describes an automated method for classifying defects on the electrode lines deposited on a flat panel display. These defects are presented to the operator at the semi-automated panel repair station during operator repair. The defect categories include surplus gold between electrodes, excessive gold on the line, insufficient gold on the line, broken line, wide line width, gold strain stick, gas bubble, and semitransparent materials. The automated method will eliminate the deficiencies of human visual classification by providing fast, accurate and repeatable defect classification. The process will free the operator from much of the work associated with data logging during the repair, enhancing operator productivity. The system will classify defects by analyzing selected features of each observed defect against a set of previously defined rules, and it processes video data in real time during the repair operation. Results can be used to control the manufacturing process to reduce the occurrence of defects or to select the proper repair procedure.
Analysis of shattered automotive glass for determining particle size, shape, and count is legally required to demonstrate that safety criteria have been met. Manual methods are labor-intensive, subjective, and prone to errors. This paper presents an approach for automating this analysis. Problems associated with missing boundary segments can cause serious errors unless corrective measures are applied. Techniques for efficient extraction of particle boundaries and restoration of missing segments are presented.
Techniques for discriminating between objects with similar colors are presented, including traditional color measurement such as Lab as well as spectral signatures. A number of methods have been evaluated for discriminating between different ceramic tile pigments. Ceramic products come in a wide variety of similar colors, and labeling mixups often occur between similar pigments, resulting in incorrect products being shipped to customers. Mislabeled products are sometimes installed, since subtle color differences are difficult to perceive, especially under marginal illumination. This results in direct expenses associated with rectifying these errors as well as customer inconvenience and dissatisfaction.
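The Lab-based side of such discrimination can be illustrated with a minimal sketch. Assuming pigment measurements are already available in L*a*b* coordinates (the reference values below are hypothetical, not from the paper), a sample is assigned to the reference pigment with the smallest CIE76 color difference ΔE*ab:

```python
import numpy as np

# Hypothetical reference Lab measurements for three similar pigments.
REFERENCES = {
    "ivory":  np.array([82.0, 3.5, 12.0]),
    "cream":  np.array([83.5, 2.8, 14.5]),
    "almond": np.array([80.5, 5.0, 11.0]),
}

def delta_e(lab1, lab2):
    # CIE76 color difference: Euclidean distance in L*a*b* space.
    return float(np.linalg.norm(np.asarray(lab1) - np.asarray(lab2)))

def classify(lab_sample):
    # Assign the sample to the reference pigment with the smallest Delta E.
    return min(REFERENCES, key=lambda name: delta_e(lab_sample, REFERENCES[name]))
```

Pigments whose pairwise ΔE falls near or below the measurement noise are exactly the cases where spectral signatures, rather than a single colorimetric distance, become necessary.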
Multispectral image sequences are one example of a class of image sequences that can be characterized as being spatially invariant. In this class of image sequences, all features are positionally invariant in each image of a given sequence but have varying gray-scale properties. The various features of the scene contribute additively to each image of the sequence but the image formation processes associated with given features have characteristic signatures describing the manner in which they vary over the image sequence. Such sequences can be processed using the simultaneous diagonalization (SD) filter which will generate gray-scale maps of the different image formation processes. The SD filter is based on an explicit mathematical model and can be used to maximize SNR, perform segmentation and provide data compression. A unique property of this approach is that even if several image formation processes occupy a given pixel, they can still be isolated. The gray-scale map associated with each process provides an estimate of the magnitude of a given process at every spatial location in the image sequence. Data compression and noise reduction can be achieved using the same spatially-invariant linearly-additive model and a variation of the simultaneous diagonalization filter.
While linear cameras offer substantial advantages over RS-170 cameras for many vision applications, the lack of suitable high-performance image-processing hardware has significantly limited their potential benefits. The very large images and high data rates associated with linear cameras impose significant processing problems that limit the capabilities of current linear-camera hardware designs. A highly desirable goal is to design new hardware architectures that will significantly improve processing capabilities to exploit the advantages of linear cameras without the need for frame buffers, with their inherent restrictions on image size and format. This paper describes an approach that offers significant improvements for linear-camera-based vision systems without requiring frame buffers. The proposed design advocates the use of lookup tables compatible with low-cost VLSI technology to perform extensive additional low-level processing. Hardware to perform data compression and extraction of key information from image data is also covered. The overall objectives of this architecture include providing low-cost hardware, flexibility, and ease of programming.
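The appeal of lookup tables for this kind of hardware is easy to demonstrate in software. The sketch below (an illustrative example, not the paper's design) precomputes a 256-entry table so that a per-pixel point operation, here an arbitrary threshold, reduces to a single memory access per pixel, which is what allows such processing to keep up with a linear camera's line rate:

```python
import numpy as np

def make_threshold_lut(threshold):
    # Precompute the point operation once for all 256 possible pixel values.
    return np.where(np.arange(256) >= threshold, 255, 0).astype(np.uint8)

def apply_lut(scanline, lut):
    # Indexing by pixel value applies the mapping in one pass, with no
    # per-pixel arithmetic -- the software analogue of a hardware LUT.
    return lut[scanline]
```

Any point operation (gamma correction, contrast stretch, binarization) fits the same 256-entry table, which is why LUTs map so naturally onto low-cost VLSI.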
This paper describes the design and implementation of a low-cost machine vision system for identifying various types of automotive wheels, which are manufactured in several styles and sizes. In this application, a variety of wheels travel on a conveyor in random order through a number of processing steps. One of these processes requires identification of the wheel type, which had previously been performed manually by an operator. A vision system was designed to provide the required identification. The system consists of an annular illumination source, a CCD TV camera, a frame grabber, and a 386-compatible computer. Statistical pattern recognition techniques were used to provide robust classification as well as a simple means for adding new wheel designs to the system. Maintenance of the system can be performed by plant personnel with minimal training. The basic steps for identification include image acquisition, segmentation of the regions of interest, extraction of selected features, and classification. The vision system has been installed in a plant and has proven to be extremely effective. It correctly identifies wheels at rates up to 30 wheels per minute regardless of rotational orientation in the camera's field of view. Correct classification can be achieved even when a portion of the wheel is hidden from the camera. Significant cost savings have been achieved through a reduction in scrap associated with incorrect manual classification as well as a reduction of labor in a tedious task.
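A statistical classifier of the kind described, one that also makes adding a new wheel design trivial, can be sketched as a nearest-mean classifier over rotation-invariant features. The feature names below are hypothetical stand-ins (the paper does not specify its feature set):

```python
import numpy as np

class NearestMeanClassifier:
    """Minimum-distance classifier: each wheel style is summarized by the
    mean of its training feature vectors (e.g. hub radius, vent-hole count,
    annulus mean intensity -- hypothetical, rotation-invariant features).
    Adding a new wheel design only requires storing one more class mean."""

    def __init__(self):
        self.means = {}

    def add_class(self, label, feature_vectors):
        # Train (or extend) the classifier with samples of one wheel style.
        self.means[label] = np.mean(np.asarray(feature_vectors, float), axis=0)

    def classify(self, features):
        # Assign the unknown wheel to the nearest class mean.
        features = np.asarray(features, dtype=float)
        return min(self.means,
                   key=lambda lbl: np.linalg.norm(features - self.means[lbl]))
```

Because the features are rotation-invariant and computed over regions rather than single points, partial occlusion of the wheel degrades the feature vector gracefully instead of breaking the match outright.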
A class of image sequences can be characterized as being spatially invariant and linearly additive based on their image formation processes. In these kinds of sequences, all features are positionally invariant in each image of a given sequence but have varying gray-scale properties. The various features of the scene contribute additively to each image of the sequence but the image-formation processes associated with given features have characteristic signatures describing the manner in which they vary over the image sequence. Examples of appropriate image sequences include multispectral image sequences, certain temporal image sequences, and NMR image sequences generated by modification of the excitation parameters. Note that image sequences can be formed using a variety of imaging modalities as long as the linearly additive and spatially invariant requirements are not violated. Features associated with different image-formation processes generally will have unique signatures that can be used to generate linear filters for isolating selected image-formation processes or for performing data compression. Starting with an explicit mathematical model, techniques are presented for generating optimal filters using simultaneous diagonalization for enhancement of desired image-formation processes and data compression with this class of image sequences. A unique property of this approach is that even if several image-formation processes occupy a given pixel, they can still be isolated.
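The linearly-additive, spatially-invariant model underlying this filtering can be made concrete with a small sketch. This illustrates only the unmixing idea the model rests on, solved here by least squares, not the simultaneous-diagonalization machinery itself: each pixel's gray-scale sequence is a linear combination of known process signatures, so per-process maps are recovered even where several processes overlap in one pixel.

```python
import numpy as np

def unmix_sequence(images, signatures):
    """images: (N, H, W) sequence; signatures: (N, P), one column per
    image-formation process, giving how that process varies over the
    N images. Returns (P, H, W): one gray-scale map per process, by
    solving the linearly-additive model images = signatures @ maps
    independently at every pixel."""
    n, h, w = images.shape
    pixels = images.reshape(n, h * w)   # each column is one pixel's sequence
    maps, *_ = np.linalg.lstsq(signatures, pixels, rcond=None)
    return maps.reshape(signatures.shape[1], h, w)
```

In the noiseless, linearly independent case the recovery is exact; with noise, the least-squares solution is where a carefully designed filter (such as the SD filter) improves on this naive sketch by shaping the trade-off between SNR and process separation.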
While linear cameras offer substantial advantages over standard television cameras for many vision applications, the lack of suitable high-performance image-processing hardware has significantly limited their potential benefits. Very powerful image-processing hardware is available for matrix cameras and, although many of these systems have provisions for interfacing to linear cameras, they restrict the inherent flexibility and power of linear cameras. The very large images and high data rates associated with linear cameras impose processing demands that exceed the very limited capability of existing linear-camera hardware designs. Hence, a highly desirable objective is new hardware that will provide significantly improved processing capability without the need for frame buffers, with their inherent restrictions on image size and format. A variety of approaches are being evaluated for enhancing linear-camera processing architectures, using processing power, cost effectiveness, flexibility, and ease of programming as the primary criteria.