The standard technique for estimating the size, location, and angle of rotation of an elliptically shaped blob in a binary image requires the calculation of the first and second moments of the blob. This approach requires that the blob be complete. If only part of the blob is present, this approach gives erroneous results. This paper presents techniques for fitting circles and ellipses to partial blobs using least-squares techniques. In the simpler cases, the solutions are in closed form, so that no iteration is required. In the more difficult problems some of the unknown parameters are eliminated, so that iteration is required on only one or two parameters. Partial-outline fitting techniques are also provided for triangles, squares, and hexagons.
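The moment-based estimate that the paper contrasts with its partial-blob fits can be sketched as follows. This is the standard construction (function and variable names are illustrative, not from the paper) and is valid only when the blob is complete:

```python
import numpy as np

def ellipse_from_moments(mask):
    """Estimate centroid, semi-axes, and rotation angle of a blob from
    its first and second central moments (valid only for a complete blob)."""
    ys, xs = np.nonzero(mask)
    cx, cy = xs.mean(), ys.mean()               # first moments: centroid
    mu20 = ((xs - cx) ** 2).mean()              # second central moments
    mu02 = ((ys - cy) ** 2).mean()
    mu11 = ((xs - cx) * (ys - cy)).mean()
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)   # rotation angle
    common = np.sqrt(4 * mu11 ** 2 + (mu20 - mu02) ** 2)
    a = np.sqrt(2 * (mu20 + mu02 + common))     # semi-major axis
    b = np.sqrt(2 * (mu20 + mu02 - common))     # semi-minor axis
    return (cx, cy), (a, b), theta
```

For an axis-aligned ellipse with semi-axes a and b, the second central moments are a²/4 and b²/4, so these formulas recover the axes exactly; removing part of the blob biases every moment at once, which is the failure mode that motivates the partial-outline fits.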
By combining two powerful optical computing techniques, namely, optical symbolic substitution (OSS) and polarization-encoded optical shadow-casting (POSC), an optical morphological hit-or-miss transformation is demonstrated and used in the recognition of perfect and imperfect shapes.
This report describes an image enhancement processor which uses mathematical morphology to improve contrast. Our goal is the realtime enhancement of cardiac angiograms and the near-realtime enhancement of peripheral angiograms (e.g., arms and legs). The processor consists of two systolic arrays for morphological processing and an image combination unit for simple pixel-by-pixel arithmetic and logical operations. The arrays and image combination unit will be integrated into a standard framework (Datacube MAXbus) for image capture, storage, and display. The rest of this report is structured in the following way: the requirements, basic enhancement technique, and processor structure are given in the next section. Then the design of the systolic arrays, image combination, and delay units is presented, followed by a discussion of the Datacube image processing framework. An organization-level simulation was developed to validate the systolic design, and its results are summarized. Finally, conclusions are drawn and open issues are discussed.
A new technique for recognizing partially occluded or incomplete 2-D representations of man-made objects is presented here. Recognition is performed without regard for the orientation, position, and scale of the objects. The technique is based on Normalized Interval Vertex Descriptors (NIVDs), a representation derived from physical characteristics of an object (vertices and sides) that is easy to obtain, especially for polygon-like shapes. The NIVD shape representation is a periodic, scale-, rotation-, and translation-invariant function of the contour of the object. Although global in nature, NIVDs, through subtemplate matching and a renormalization process applied to the matching sections, can be used for partial shape recognition. Objects in the scene are identified as belonging to a given class by finding the largest partial match between the object representation and a previously acquired clean representation of the class model. Experimental results of this technique are also included.
Fast Fourier Transforms (FFTs) are frequently employed in applications such as image processing and speech recognition. Though FFT calculations can be sped up considerably, real-time processing requirements remain well beyond the capabilities of modern uniprocessor systems. Computing power can be substantially increased by exploiting the parallelism inherent in FFT calculations. However, the experimental performance of the Parallel FFT (PFFT) algorithm has not been sufficiently investigated in a loosely coupled multiprocessor environment. In this paper we evaluate the implementation of a PFFT on a network of T800 series transputers connected in the form of a grid. We analyze the speedup obtained, taking into account both computation load and communication overhead. A rudimentary load balancing algorithm has been incorporated so that load balancing accounts for both computation and communication loads. Realistic performance figures, obtained through actual measurements on the system, are provided and compared with figures derived from an analysis of the practical complexity of the implementation.
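The separability that lets an FFT parallelize on a processor grid is the row-column decomposition: each processor transforms its block of rows independently, the data are exchanged across the grid (the communication overhead analyzed above), and each processor then transforms its block of columns. A minimal single-process sketch of that decomposition, using NumPy rather than transputers:

```python
import numpy as np

def fft2_row_column(img):
    """2-D FFT via the row-column decomposition: 1-D FFTs over all rows,
    a transpose (standing in for the inter-processor exchange on a grid),
    then 1-D FFTs over the former columns."""
    rows_done = np.fft.fft(img, axis=1)   # each processor handles a block of rows
    transposed = rows_done.T              # corresponds to the all-to-all exchange
    cols_done = np.fft.fft(transposed, axis=1)
    return cols_done.T

img = np.random.rand(8, 8)
assert np.allclose(fft2_row_column(img), np.fft.fft2(img))
```

The transpose step is where the computation/communication trade-off studied in the paper arises: the 1-D transforms are embarrassingly parallel, while the exchange cost grows with both image size and grid size.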
Machine vision algorithms are partitioned into three distinct levels: low level (image to image transformations), intermediate level (image to symbolic transformations), and high level (symbolic manipulation). Low level image processing requires a large amount of data manipulation which can be cost-effectively processed using a linear processor array. When performing intermediate level operations, however, the linear processor array is liable to generate communications bottlenecks which reduce its efficiency. The aim of this work is to enhance the linear processor array architecture for intermediate level processing. An investigation of the matching of intermediate level algorithms to appropriate computer architectures is presented. Consequently, an augmented tree-structured MIMD processor network is devised, tightly coupling the low, intermediate, and high level image processing stages. The network is realized using the Inmos Transputer. A representative selection of intermediate level algorithms is executed on the machine. The performance of the realized network is compared to several commercially available systems. As the network is increased in size, it is shown that the communications bottlenecks in the linear processor array are reduced to a negligible level. Future enhancements to the system are finally considered, including automated object and feature recognition, and a tree-structured hierarchy of Prolog processes.
This paper describes an efficient approach, developed by the authors, for labelling images using a combination of pipeline (Datacube) and host (general purpose computer) processing. The output of the algorithm is a coordinate list of labelled object pixels that facilitates further high level operations.
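As a rough illustration of the kind of output described, here is a minimal host-side labelling sketch producing a coordinate list per labelled object. The names and the breadth-first formulation are illustrative; the paper's pipeline/host division of labor is not reproduced here:

```python
from collections import deque
import numpy as np

def label_coordinates(binary):
    """Label 4-connected foreground components of a binary image and
    return, for each label, the list of (row, col) pixel coordinates --
    the coordinate-list form that supports later high-level operations."""
    labels = np.zeros(binary.shape, dtype=int)
    coord_lists = {}
    next_label = 0
    for r, c in zip(*np.nonzero(binary)):
        if labels[r, c]:
            continue                      # pixel already belongs to a component
        next_label += 1
        queue, coords = deque([(r, c)]), []
        labels[r, c] = next_label
        while queue:                      # breadth-first flood fill
            y, x = queue.popleft()
            coords.append((y, x))
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if (0 <= ny < binary.shape[0] and 0 <= nx < binary.shape[1]
                        and binary[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = next_label
                    queue.append((ny, nx))
        coord_lists[next_label] = coords
    return coord_lists
```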
The development of a vision system for fast recognition of `flat' rigid 2-D representations of man-made objects belonging to a large library is presented here. The library is organized into a relational database where every tuple represents a prototype class. Tuples are made of attributes defined over features characterizing the object. Classification speed is gained by constraining the number of comparisons of the unknown object representation to those in the library of known objects. Comparisons are only made on a `candidate' set of possible prototypes (usually smaller than the total number of prototypes) derived from the library `object' set through the application of successive refinement filters (restriction operators) operating on the database attributes. The Normalized Interval Vertex Descriptor (NIVD) representation is used to describe objects. NIVDs, a representation derived from physical characteristics of an object (vertices and sides), not only provide a compact representation but also allow the definition of attributes that can be used to define the relation. In addition, since the NIVD representation is translation, rotation, and scale invariant, recognition is performed without regard for the orientation, position, and scale of the objects. Experimental results of this process are also included.
The computational resources needed to implement image processing algorithms exceed the capability of most current VMEbus architectures. This is no reason to abandon these widely used systems if we can complement them with the appropriate tools. In this paper we describe a VMEbus-compatible architecture based on Digital Signal Processors (DSPs) to carry out the computation-intensive routines of image processing algorithms.
Pipelined image processing hardware has become increasingly popular because it makes it possible to build real-time machine vision systems at reasonable cost. Unfortunately, this type of hardware is often difficult to program, and the difficulty increases rapidly as the machines become more flexible and powerful. In this paper we present PRISM, a visual programming language that supports rapid prototyping and algorithm development on pipelined image processors. Computations are represented by graphs whose nodes are data transformations and whose arcs are data paths. The system allows the user to build and edit graphs and attach attributes to graph nodes specifying details of the computation (gains, masks, et cetera). Once the graph is adequately connected, the system traverses the graph, analyzes the data dependencies, and constructs an execution schedule. It then repeatedly executes the schedule, mapping graph nodes to specific hardware resources as needed. We discuss the overall architecture of the system, describe the class of hardware devices to which it is applicable, and then present an implementation for the Datacube MV 20. We analyze the implementation in terms of how well it makes use of the underlying hardware, and discuss ways of improving its efficiency.
We present a fully automated system which unites CCD camera technology with liquid crystal technology to create a polarization camera capable of sensing the polarization of reflected light from objects at pixel resolution. As polarization affords a more general physical description of light than does intensity, it can therefore provide a richer set of descriptive physical constraints for the understanding of images. Recently, it has been shown that polarization cues can be used to perform dielectric/metal material identification, specular and diffuse reflection component analysis, as well as complex image segmentations that would be immensely more complicated or even infeasible using intensity and color alone. Such analysis has so far been done with a linear polarizer mechanically rotated in front of a CCD camera. The full automation of resolving polarization components using liquid crystals not only affords an elegant application, but reduces the amount of optical distortion present in the wobbling of a mechanically rotating polarizer. In our system two twisted nematic liquid crystals are placed in front of a fixed polarizer placed in front of a CCD camera. The application of a series of electrical pulses to the liquid crystals in synchronization with the CCD camera video frame rate produces a controlled sequence of polarization component images that are stored and processed on Datacube boards. We present a scheme for mapping polarization states into hue, saturation, and intensity which is a very convenient representation for a polarization image. Our polarization camera outputs such a color image which can then be used in polarization-based vision methods.
The unique vision understanding capabilities of our polarization camera system are demonstrated with experimental results showing polarization-based dielectric/metal material classification, specular reflection, and occluding contour segmentations in a fairly complex scene, and surface orientation constraints for object recognition.
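A common way to obtain such polarization descriptors from component images is sketched below, under the assumption of three linear-polarizer orientations (0, 45, and 90 degrees) and a hue/saturation/intensity mapping of the general kind the paper proposes; the authors' exact component sequence and mapping may differ:

```python
import numpy as np

def polarization_to_hsi(i0, i45, i90):
    """Map three linear-polarization component images (polarizer at 0,
    45, and 90 degrees) to hue/saturation/intensity via the linear
    Stokes parameters.  A sketch of the representation described in the
    paper, not the authors' exact mapping."""
    s0 = i0 + i90                      # total intensity
    s1 = i0 - i90                      # 0/90-degree preference
    s2 = 2.0 * i45 - i0 - i90          # 45/135-degree preference
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, 1e-9)  # degree of linear pol.
    angle = 0.5 * np.arctan2(s2, s1)   # polarization orientation
    hue = (angle + np.pi / 2) / np.pi  # orientation mapped into [0, 1)
    saturation = np.clip(dolp, 0.0, 1.0)
    intensity = s0 / 2.0
    return hue, saturation, intensity
```

Fully polarized light saturates the color, unpolarized light desaturates it, and the polarization orientation rotates the hue, which makes material boundaries and specularities visually salient in the resulting image.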
This paper describes on-going research into machine vision systems based on the line-scan or linear array type cameras. Such devices have been used successfully in the production line environment, as the inherent movement within the manufacturing process can be utilized for image production. However, applications such as these have traditionally involved using the line-scan device in a purely two-dimensional role. Initial research was carried out to extend such 2-D arrangements into a 3-D system, retaining the lateral motion of the object with respect to the camera. The resulting stereoscopic camera allowed three-dimensional coordinate data to be extracted from a moving object volume (workspace). The most recent work has involved rotating line-scan systems in relation to a static scene. This allows images to be produced with fields of view varying in both size and position during the rotation. Due to the nature of the movement, the images can be complex, depending on the size of the field of view selected. Benefits of obtaining images in this fashion include `all-round' observation, variable resolution in the movement axis and a calibrated volume that can be moved to observe any point in a 360 degree arc.
While linear cameras offer substantial advantages over RS-170 cameras for many vision applications, the lack of suitable high-performance image-processing hardware has significantly limited their potential benefits. The very large images and high data rates associated with linear cameras impose significant processing problems that limit the capabilities of current linear-camera hardware designs. A highly desirable goal is to design new hardware architectures that will significantly improve processing capabilities to exploit the advantages of linear cameras without the need for frame buffers with their inherent restrictions on image size and format. This paper describes an approach that offers significant improvements for linear-camera based vision systems without requiring frame buffers. The proposed design advocates the use of lookup tables compatible with low-cost VLSI technology to perform extensive additional low-level processing. Hardware to perform data compression and extraction of key information from image data is also covered. The overall objectives of this architecture include providing low-cost hardware, flexibility, and ease of programming.
The paper presents a new technique for efficient polygon approximation of digitized planar curves. Polygon approximation algorithms based on sequential scan, split-and-merge, and iterative techniques have some drawbacks: they shift the corner points of the given curve; they distort the original symmetry of the curve; the approximation depends on the starting point, and the starting points are taken as break points; and they cannot preserve the identity of segments whose lengths lie between ε and 2ε, where ε is the maximum allowed absolute deviation error. The proposed technique grows the edges of the polygon approximation based on the principle of merging. Edges are grown at the points where the minimum merging error is produced. This simultaneous growing of edges overcomes the drawbacks present in the sequential scan, split-and-merge, and iterative techniques of polygon approximation. Merging is done on the sides of an initial polygon approximation obtained by template matching. The technique also provides scope for parallel implementation of the total task.
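The merging principle can be illustrated with a simple greedy variant on an open polyline. This sketch is illustrative only: the paper grows edges simultaneously and seeds the polygon by template matching, both of which are omitted here:

```python
import math

def point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    num = abs((by - ay) * px - (bx - ax) * py + bx * ay - by * ax)
    den = math.hypot(bx - ax, by - ay)
    return num / den if den else math.hypot(px - ax, py - ay)

def merge_approximate(points, eps):
    """Greedy merge-based polygon approximation: repeatedly delete the
    interior vertex whose two incident edges merge with the smallest
    deviation, as long as that deviation stays within eps."""
    pts = list(points)
    while len(pts) > 2:
        # deviation introduced by merging the edges around each interior vertex
        errs = [point_line_distance(pts[i], pts[i - 1], pts[i + 1])
                for i in range(1, len(pts) - 1)]
        i_min = min(range(len(errs)), key=errs.__getitem__)
        if errs[i_min] > eps:
            break
        del pts[i_min + 1]   # merge the two edges meeting at this vertex
    return pts
```

Because every merge is chosen by minimum error rather than by scan order, the surviving vertices tend to sit on true corners, which is the symmetry-preserving behavior the paper targets.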
Scanning probe microscopy (SXM), which includes techniques such as scanning tunneling microscopy (STM) and scanning force microscopy (SFM), is becoming increasingly popular for analyzing surface structure at the sub-micron level. As the probe used for scanning is non-ideal, the image output by SXM is dependent on the shape and size of the probe. The use and success of SXM strongly depend on methods for ensuring the accuracy of the images produced by SXM. In this paper, we derive models of the effects of the probe shape geometry on the image produced by SXM. Methods are formulated for recovering the true surface from the imaged surface and for indicating where the surface reconstruction is exact and where it is uncertain. We formulate these methods both for images scanned in a `contact' mode and those scanned in a `non-contact' mode. It is shown that scanning in a non-contact mode by a non-ideal probe is equivalent to scanning in a non-contact mode by an ideal probe followed by scanning in a contact mode by the non-ideal probe. The methods developed in this paper can be used to recover a surface scanned by a scanning probe microscope, given the shape of the probe used for scanning, and for visualizing the scanning and recovery of surfaces by different probe shapes.
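The contact-mode model is commonly expressed with grayscale morphology: the image is the dilation of the true surface by the (reflected) probe, and the best recoverable estimate is the erosion of the image by the probe. A 1-D sketch under that formulation, a simplification of the paper's models with illustrative names:

```python
import numpy as np

def scan_contact(surface, tip):
    """Contact-mode image of a 1-D surface: grayscale dilation of the
    surface by the reflected tip (tip given as heights, apex at center)."""
    surface = np.asarray(surface, dtype=float)
    tip = np.asarray(tip, dtype=float)
    k, r = len(tip), len(tip) // 2
    padded = np.pad(surface, r, constant_values=-np.inf)
    return np.array([np.max(padded[i:i + k] + tip[::-1])
                     for i in range(len(surface))])

def reconstruct(image, tip):
    """Best recoverable surface estimate: grayscale erosion of the image
    by the tip.  The result is an upper bound on the true surface."""
    image = np.asarray(image, dtype=float)
    tip = np.asarray(tip, dtype=float)
    k, r = len(tip), len(tip) // 2
    padded = np.pad(image, r, constant_values=np.inf)
    return np.array([np.min(padded[i:i + k] - tip)
                     for i in range(len(image))])
```

The reconstruction never dips below the true surface; wherever re-scanning the estimate reproduces the image, recovery is exact, which is the basis for marking where the reconstruction is certain and where it is not.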
We use the paraxial geometric optics model of image formation to derive a set of camera focusing techniques. These techniques do not require calibration of cameras but involve a search of the camera parameter space. The techniques are proved to be theoretically sound. They include energy maximization of unfiltered, low-pass filtered, high-pass filtered, and band-pass filtered images. It is shown that in the presence of high spatial frequencies, noise, and aliasing, focusing techniques based on band-pass filters perform well. The focusing techniques are implemented on a prototype camera system named SPARCS. The architecture of SPARCS is described briefly. The performance of the different techniques is compared experimentally. All techniques are found to perform well. One of them -- the energy of the low-pass filtered image gradient -- has the best overall characteristics and is recommended for practical applications.
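The recommended measure, the energy of the low-pass filtered image gradient, can be sketched as follows. The box filter, its size, and the `capture` callback are illustrative assumptions, not details of SPARCS:

```python
import numpy as np

def focus_measure(img, kernel_size=3):
    """Energy of the gradient of a low-pass filtered image.  A sharper
    image has more gradient energy, so the measure peaks at the
    in-focus lens setting."""
    k = np.ones((kernel_size, kernel_size)) / kernel_size**2
    pad = kernel_size // 2
    padded = np.pad(img.astype(float), pad, mode='edge')
    # simple box low-pass filter implemented as a 2-D correlation
    smooth = sum(padded[i:i + img.shape[0], j:j + img.shape[1]] * k[i, j]
                 for i in range(kernel_size) for j in range(kernel_size))
    gy, gx = np.gradient(smooth)
    return float(np.sum(gx**2 + gy**2))

def autofocus(capture, lens_positions):
    """Search the camera parameter space for the lens position that
    maximizes the focus measure (capture is a hypothetical frame-grab
    callback, one image per candidate position)."""
    return max(lens_positions, key=lambda s: focus_measure(capture(s)))
```

The low-pass stage is what gives this measure its noise robustness: sensor noise contributes mostly high-frequency gradient energy, which the smoothing suppresses before the gradient is taken.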
This work is related to medical robotics applied to endoscopic image processing. When operating, the surgeon uses both an endoscope and a color video camera but does not get any 3-D information about the analyzed cavity. The use of only one camera has led us to the concept of axial stereovision. In this paper, we present a matching process using color information in order to recover the 3-D shapes of non-polyhedral objects (such as organs) with a zoom sensor.
The accepted method of programming machine vision systems for a new application is to incorporate sub-routines from a standard library into code, written specially for the given task. Typical programming languages that might be used here are Pascal, C, and assembly code, although other `conventional' (i.e., imperative) languages are often used instead. The representation of an algorithm to recognize a certain object, in the form of, say, a C language program is clumsy and unnatural, compared to the alternative process of describing the object itself and leaving the software to search for it. The latter method, known as declarative programming, is used extensively both when programming in Prolog and when people talk to one another in English, or other natural languages. Programs to understand a limited sub-set of a natural language can also be written conveniently in Prolog. The article considers the prospects for talking to an image processing system, using only slightly constrained English. Moderately priced speech recognition devices, which interface to a standard desk-top computer and provide a limited repertoire (200 words) as well as the ability to identify isolated words, are already available commercially. At the moment, the goal of talking in English to a computer is incompletely fulfilled. Yet, sufficient progress has been made to encourage greater effort in this direction.
In the nuclear fuels industry, a great deal of effort goes into ensuring that quality materials are produced. Of these materials, none receives more attention than the uranium-oxide nuclear fuel pellets. These cylindrically shaped pellets (approximately 1/2 inch long by 1/2 inch in diameter) are carefully produced and then meticulously inspected for various defects (e.g., cracks, chips, etc.). The inspection process is designed to remove any defective pellets from each lot, assuring the end user a reliable, predictable, and safe product. The current (manual) inspection process is laborious and subjective in nature. The inspector also receives prolonged exposure to low-level radiation. For these reasons, automated inspection of nuclear fuel pellets has long been a goal of the industry. However, it is not a simple task, due to the many material handling and image processing challenges involved in inspecting pellets at production rates (greater than five per second). This paper describes an automated nuclear fuel pellet inspection system that has successfully met these challenges. Built around a set of modular, high-speed, pipelined image processing hardware, it inspects pellets at rates of up to seven pellets per second. Recent tests have shown better than 97% detection rates with less than 2% false reject rates. Image processing algorithms and solutions to design challenges are described.
This paper describes the design and implementation of a low-cost machine vision system for identifying various types of automotive wheels which are manufactured in several styles and sizes. In this application, a variety of wheels travel on a conveyor in random order through a number of processing steps. One of these processes requires identification of the wheel type, a task previously performed manually by an operator. A vision system was designed to provide the required identification. The system consisted of an annular illumination source, a CCD TV camera, a frame grabber, and a 386-compatible computer. Statistical pattern recognition techniques were used to provide robust classification as well as a simple means for adding new wheel designs to the system. Maintenance of the system can be performed by plant personnel with minimal training. The basic steps for identification include image acquisition, segmentation of the regions of interest, extraction of selected features, and classification. The vision system has been installed in a plant and has proven to be extremely effective. The system correctly identifies wheels at rates of up to 30 wheels per minute, regardless of rotational orientation in the camera's field of view. Correct classification can even be achieved if a portion of the wheel is blocked from the camera's view. Significant cost savings have been achieved by a reduction in scrap associated with incorrect manual classification as well as a reduction of labor in a tedious task.
In this paper we put forward a framework for the flexible packing of planar shapes of random shape and size under visual control. The basic aim of this system is not only to produce an efficient packing strategy, but one that is also flexible enough for industrial use; as such, the method emphasizes a systems approach to dealing with industrial vision problems. The framework consists of two major components: a morphology-based geometric packing approach used in conjunction with a heuristic packing procedure. Considerations handled at the heuristic level include shape ordering and shape orientation, both of which must be carried out before the shapes are passed to the geometric packer. The heuristic component also deals with the context information specific to our application. We also discuss the various issues that arise from this approach, such as the system's properties and performance, against the background of some sample applications. The ideas outlined in this paper are currently being used in the development of a visually controlled intelligent packing work cell.
The pad analysis system (PAS) is an automated visual inspection system developed through a joint effort of the research and manufacturing divisions. It inspects wafers for low-volume solder balls. Solder balls (also called pads or bumps) are used to join semiconductor chips to substrates. When low-volume solder balls fail to join, there is a resulting open-circuit defect. PAS is a cost-effective method of providing customers with a high quality level. The solder ball manufacturing process, while providing excellent quality, is not capable of producing the levels needed by the sophisticated chips and modules used in mainframes. PAS also provides a benefit by increasing yield. Its increased accuracy reduces the overkill associated with manual inspection. PAS is also programmed with pad functionality information about which non-critical low-volume pads can be safely shipped. This information is typically too complex to be applied manually. PAS significantly reduces rework, generates labor savings, and even improves yield. PAS is an example of both a productive collaboration and a successful technology transfer between research and manufacturing. It is also the story of incremental improvements in hardware, software, and function which allowed the system to inspect increasingly complex chips with decreased cycle time and reduced operator intervention.
In this paper we describe the major aspects of our transputer-based automatic vision system, which aims to be scalable and easily reconfigurable. In this system the mapping of image processing and recognition algorithms to the hardware is facilitated by automatic code generation schemes, separating methodic design from implementation details.
SailSpy is a real-time vision system which we have developed for automatically measuring sail shapes and masthead rotation on racing yachts. Versions have been used by the New Zealand team in two America's Cup challenges in 1988 and 1992. SailSpy uses four miniature video cameras mounted at the top of the mast to provide views of the headsail and mainsail on either tack. The cameras are connected to the SailSpy computer below deck using lightweight cables mounted inside the mast. Images received from the cameras are automatically analyzed by the SailSpy computer, and sail shape and mast rotation parameters are calculated. The sail shape parameters are calculated by recognizing sail markers (ellipses) that have been attached to the sails, and the mast rotation parameters by recognizing deck markers painted on the deck. This paper describes the SailSpy system and some of the vision algorithms used.
This paper mainly addresses the problems of automatic segmentation and automatic recognition of weld defects in x-ray inspection. The task of automatic segmentation is to segment the x-ray image accurately into weld defect regions and background regions. The task of automatic classification is to classify a defect into one of several types of weld defects, each of which may be caused by a different welding fault. Novel and effective algorithms are proposed and discussed.
A car license plate reader (CLPR) using fuzzy inference and neural network algorithms has been developed at the Industrial Technology Research Institute (ITRI) and installed in highway toll stations to identify stolen cars. It takes an average of 0.7 seconds to recognize a car license plate using a PC with an 80486-50 CPU. The recognition rate of the system is about 97%. The techniques used in the CLPR include vehicle sensing, image grab control, optical pre-processing, lighting, and optical character recognition (OCR). The CLPR can also be used for vehicle flow statistics, checking for stolen vehicles, automatic charging systems in parking lots, garage management, and so on.
Forestry has for many years been a major New Zealand industry, within which the manufacture of reconstituted products from wood fiber is becoming increasingly significant. The demand for a consistently high-quality surface finish in products, such as medium density fiberboard panels, introduces inspection requirements that cannot be easily met by manual inspection. This paper discusses the development of a prototype inspection system for wood panels to detect and classify the various defect types at production rates. The range of surface defects occurring during the manufacture of this product includes those having both color and textural variations. Since some of these defects are quite small and subtle, the processing requirements are substantial. The prototype uses a combination of general purpose processor and pipelined processing modules to process images obtained from the moving product.
Hardware capable of recognizing the `named' colors (e.g., `red,' `yellow,' `orange,' etc.) is available now at modest cost. This has been interfaced to a standard computer running Prolog. The result is a powerful combination, capable of intelligently interpreting colored images, such as those on simple product packaging. The structure and applications of such a system are described. Prolog programs are presented which are capable of recognizing bananas, flags, and dragons. Learning color patterns is also discussed.
This paper describes the prototype design of a real time color image compression board. The hardware implements image compression as defined by the Joint Photographic Experts Group commonly referred to as the JPEG standard. The architecture and supported image compression modes are described. The design utilizes LSI Logic's JPEG chipset and additional supporting hardware and resides on a 6U VME ProtoMax II prototyping board from Datacube. The design conforms to Datacube's MAXbus specification and can be run at full RS-170 frame rates. Data is input to the board in a 10 MHz pixel stream and emerges from the board at 10 MHz in a compressed format with appropriate byte stuffing and image marker codes. Future directions for supporting MPEG are discussed.
The semi-automated film video reader system (SAFVR) is an integrated system for motion sequence analysis, including acquisition, qualitative analysis, quantitative analysis, and storage of tracks and images. The SAFVR system can digitize high resolution images from film and video, save the digitized images to disk, perform object tracking for rigid bodies, and produce video tapes for presentation of analysis results. The tracking is based on a hierarchical correlation matching algorithm.
In this paper active constraint set methods are applied with classical Lagrange multiplier analysis to recover constrained model parameters from monocular images. Specific cases are shown from a number of complex models, demonstrating that the convergence process correctly recovers the original parameters from small amounts of matching data relative to the large number of parameters and constraints describing the models. Application domains include real-time tracking of parametric models and calibration of vision equipment for factory settings.
A multi-purpose hardware system for processing images at video rates is described. Image sequence hardware for temporal analysis in realtime (ISHTAR) uses 18 TI TMS320c40 (c40) DSPs to process input from a CCD camera or VCR source. The hardware architecture consists of a pipeline of nine processor boards, each with two c40 processors, the whole system being synchronized by the vertical sync of the input device. This enables the calculation of a number of two dimensional convolutions to be achieved at video frame rates with a delay between the input and the output dictated by the length of the pipeline. The system is fully reconfigurable in software and partially reconfigurable in hardware so that many different types of image processing algorithms can be implemented. The specific application of a generalized gradient model to measure image motion is described, outlining the particular program structure dictated by the hardware design. The SUN 4 host has access to each processor and has the ability to change parameters and program control while the system is running. In this way active control feedback loops can be employed, particularly when the motion of the camera is under the host control, forming an active vision system. Simulations using real image sequences are presented.
The manufacture of food products for human consumption is an operation requiring strict levels of quality assurance to ensure that no foreign material is entrained in the final product. Once the product has been packaged, the options for inspection are severely limited. The machine vision team at Industrial Research Ltd. has, on a number of occasions, undertaken the inspection of large quantities of such products using x-ray video imaging and visual inspection. To perform such an operation manually is at best tedious, but also entails a degree of concentration that is difficult to maintain over long periods of time. This paper discusses the development of a real-time system for automatically inspecting canned products. The system uses high speed vision hardware to inspect the contents of each can. The system is capable of automatically rejecting cans containing foreign material.
Production of integrated circuits necessitates inspection of masks, chips, and wafers to guarantee yield and quality. This paper presents a simple, low-cost mask inspection system. The system differs from commercially available mask inspection systems in an important way: it is based on an IBM-compatible PC, and provides a simple and cost-effective solution for the application described. The hardware and software of the system have been modularized with the objective of making the system more versatile and reconfigurable. The basic system can easily be coupled with various front ends. Thus, with slight modifications to the software, the system can be used to pursue other applications as well. The inspection algorithms make use of reference comparison and feature extraction approaches for guaranteed defect detection. The defects detected are further analyzed to extract characteristic features such as location, dimensions, and type, to create a diagnostic report at the end of inspection. The defect data in the report can be used for online mask repair. Preliminary experiments with the system have shown promising results. The configuration of the system, along with the image processing algorithms used, is detailed. The paper ends with a brief discussion of the results obtained.
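The reference-comparison approach mentioned can be illustrated by a minimal defect detector on binary masks. This is a generic sketch with a small Chebyshev alignment tolerance; the names are illustrative and the paper's actual algorithms are not reproduced:

```python
import numpy as np

def tolerant_diff(a, b, tol=1):
    """Pixels of mask a that have no matching pixel of mask b within a
    Chebyshev distance of tol (absorbing small registration error)."""
    h, w = a.shape
    pb = np.pad(b, tol)                       # pad with background
    near = np.zeros_like(b)
    for dy in range(-tol, tol + 1):           # OR of b over the tolerance window
        for dx in range(-tol, tol + 1):
            near |= pb[tol + dy:tol + dy + h, tol + dx:tol + dx + w]
    return a & ~near

def detect_defects(test, reference, tol=1):
    """Reference comparison: extra material (in test but not near the
    reference) plus missing material (in reference but not near test)."""
    return tolerant_diff(test, reference, tol) | tolerant_diff(reference, test, tol)
```

The symmetric difference catches both pinhole (missing) and spot (extra) defect classes, and the flagged pixel groups can then be measured for the location, dimension, and type features that go into the diagnostic report.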
Machine vision techniques span a number of application domains. This paper presents a successful attempt to use these techniques for the interpretation and manipulation of engineering drawings. Microfilming is widely used for archiving engineering drawings and increasing their portability. Retrieval of these drawings for further modification and updating is a cumbersome task. CAD tools cannot be used for this purpose because the drawings do not conform to the required format. A simple PC-based machine vision system, which acts as an interface for converting the microfilmed drawings into a file accessible by a specific CAD tool, is described. Image analysis routines have been used for contour tracking, detection of critical points, and determination of orientation, length, perimeter, etc. Features of each individual pattern are extracted to create a database. The user is given the option of choosing a specific CAD tool, and a file in the format required by that CAD tool is generated from the database. The paper deals with the software developed and reports the results obtained.
There are many products that are produced as a continuous ribbon and contain repeated patterns or features. There is a need for unsupervised learning of these products so that automated inspection can be performed. With many inspection tasks, however, the problem is not deciding what class of product is being examined, but distinguishing a good product from a bad one. With established classification methods, it would be necessary to present a representative sample of all `bad' products to the system for training, as well as a `good' class. It is highly improbable that this could be achieved within the workings of a production factory. Automated inspection therefore requires recognition techniques that train on only good samples, or one-class learning/recognition. This paper describes a machine vision method which learns from good examples shown to the system. From this, a knowledge base is created and used for the subsequent inspection of these patterns.