A two-level processing scheme for real-time image understanding is proposed, introducing example-based reasoning into neural AI systems. The system has two levels: a component level and a structure level. At the component level, elementary pattern recognition is performed as in conventional pattern recognition, while syntactic pattern recognition is done at the structure level. Both levels are inherently time-consuming. Pattern recognition assisted by syntax recognition reduces the total complexity of the processing, and the system can perform real-time image understanding when VLSI chips are introduced. As a result, we obtain a practical real-time image understanding scheme by introducing neural pattern recognition at the component level and a case-based AI technique at the structure level.
Zero-crossings extracted from LoG filtering are widely used for pattern recognition and computer vision. Recursive filtering techniques have been applied to reduce the computational complexity of LoG filtering. However, extracting zero-crossings remains a time-consuming task and usually requires a second scanning pass after filtering. This paper presents an algorithm that uses a line buffer to extract zero-crossings during LoG filtering, avoiding the need for an additional convolution pass and extra memory units. Because we know between which two pixels the response changes sign, the exact location of each zero-crossing can be calculated to sub-pixel accuracy by linear interpolation. The line buffer can be implemented easily in hardware, and real-time processing can be achieved.
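The sub-pixel step can be sketched as follows: along one scan line of LoG output, a sign change between adjacent pixels pins the crossing between them, and linear interpolation gives its fractional position. This is a minimal illustration of the interpolation idea, not the paper's line-buffer implementation.

```python
import numpy as np

def zero_crossings_1d(log_row):
    """Locate sub-pixel zero-crossings along one scan line of a
    LoG-filtered image: wherever the response changes sign between
    pixels i and i+1, linearly interpolate the zero position."""
    r = np.asarray(log_row, dtype=float)
    crossings = []
    for i in range(len(r) - 1):
        a, b = r[i], r[i + 1]
        if a == 0.0:
            crossings.append(float(i))          # exact zero on a pixel
        elif a * b < 0.0:
            # the line through (i, a) and (i+1, b) hits zero here
            crossings.append(i + a / (a - b))
    return crossings
```

For example, the row `[2.0, 1.0, -1.0, -3.0]` yields a single crossing at 1.5, halfway between the two pixels where the sign flips.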
The nonlinear anisotropic diffusion process is effective at eliminating noise while preserving the accuracy of edges, and has been widely used in image processing. However, the filtering depends on the threshold of the diffusion process, i.e., the cut-off contrast of edges. This threshold varies from image to image, and even from region to region within an image; the problem is compounded by intensity distortion and contrast variation. We have developed an adaptive diffusion scheme that applies the Central Limit Theorem to threshold selection. Gaussian and Rayleigh distributions are used to model the distributions of visual objects in images. Regression under these distributions separates the distribution of the major object from those of other visual objects within a single-peak histogram, and this separation determines the threshold automatically. A fast algorithm is derived for the regression process. The method has been used successfully to filter various medical images.
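As a minimal sketch of the kind of process being tuned, here is one explicit Perona-Malik diffusion step where the contrast threshold is picked automatically from the gradient-magnitude histogram. The percentile heuristic stands in for the authors' distribution-regression method, and the wrap-around border handling is a simplification for illustration.

```python
import numpy as np

def diffuse_step(img, dt=0.2, k=None):
    """One explicit Perona-Malik anisotropic diffusion step.
    If k (the edge-contrast cut-off) is not given, choose it from the
    histogram of neighbour differences -- a percentile heuristic
    standing in for the regression-based selection in the abstract."""
    img = np.asarray(img, dtype=float)
    # Differences toward the four neighbours (periodic border via roll;
    # fine for a sketch, a real filter would replicate the border).
    n = np.roll(img, -1, axis=0) - img
    s = np.roll(img, 1, axis=0) - img
    e = np.roll(img, -1, axis=1) - img
    w = np.roll(img, 1, axis=1) - img
    if k is None:
        k = np.percentile(np.abs(np.stack([n, s, e, w])), 90) + 1e-8
    g = lambda d: np.exp(-(d / k) ** 2)   # edge-stopping function
    return img + dt * (g(n) * n + g(s) * s + g(e) * e + g(w) * w)
```

Differences well below `k` are smoothed strongly (`g` near 1), while contrasts well above `k` are preserved (`g` near 0), which is why the cut-off choice matters so much.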
A point-symmetry function based on autoconvolution is described which makes it possible to track the position of point-symmetric objects with sub-pixel accuracy. The method is insensitive to grey level and was developed to provide a fast and robust algorithm for real-time tracking of small magnetic particles in a light microscope. The phase-contrast microscope image of the 4.5-micrometer-diameter spherical particle consisted of concentric light and dark fringes whose shape depended on the focus. The position of the particle could be monitored in real time at 25 Hz with a lateral accuracy of +/- 20 nm, corresponding to less than +/- 0.1 pixel. To determine the vertical (z) position, a new parameter was defined representing a measure of the second derivative of the intensity function; the vertical position could thus be determined with an accuracy of +/- 50 nm. The magnetic particle could be tracked and guided by applied magnetic fields to remain in a fixed position, or programmed to scan either horizontal or vertical surfaces. Forces down to 10^-14 N could be measured by monitoring the applied magnetic forces. One- and two-dimensional Brownian motion could be studied by regulating the particle to a fixed z-position and monitoring the lateral position.
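The core autoconvolution idea can be shown in one dimension: for a profile f symmetric about c, the autoconvolution f*f peaks at 2c, so half the sub-pixel peak position recovers the centre. This is a sketch of the principle, with a three-point parabolic fit for the sub-pixel refinement; the paper's 2D microscope implementation is not reproduced here.

```python
import numpy as np

def symmetry_center_1d(signal):
    """Estimate the centre of a point-symmetric 1-D profile from the
    peak of its autoconvolution: if f is symmetric about c, (f*f)
    peaks at 2c. Mean subtraction gives grey-level insensitivity,
    and a parabolic fit around the discrete peak gives sub-pixel output."""
    f = np.asarray(signal, dtype=float)
    f = f - f.mean()                      # remove the grey-level offset
    ac = np.convolve(f, f, mode="full")   # autoconvolution, length 2n-1
    p = int(np.argmax(ac))
    # three-point parabolic interpolation around the discrete peak
    if 0 < p < len(ac) - 1:
        y0, y1, y2 = ac[p - 1], ac[p], ac[p + 1]
        denom = y0 - 2 * y1 + y2
        if denom != 0:
            p = p + 0.5 * (y0 - y2) / denom
    return p / 2.0
```

A symmetric fringe-like profile such as `[0, 0, 1, 2, 3, 2, 1, 0, 0]` returns centre 4.0, and `[0, 0, 1, 3, 3, 1, 0, 0]` returns the between-pixel centre 3.5.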
Real-time vision has many applications in the area of semiconductor manufacturing. Typical processes use machine vision for alignment, inspection, measurement, process control, and quality control. This paper describes one application where machine vision is used in conjunction with a diffraction-based optical technique to measure linewidth for use in critical dimension control.
Inspection and identification of cylindrical or conical objects presents a unique challenge for a machine vision system. Due to the circular nature of these objects, it is difficult to image the whole object using traditional area cameras and image capture methods. This work describes a unique technique for acquiring a 2D image of the entire surface circumference of a cylindrical or conical object. The specific application of this method is the identification of large-caliber ammunition rounds in the field as they are transported between or within vehicles. The proposed method uses a line scan camera in combination with high-speed image acquisition and processing hardware to acquire images from multiple cameras and generate a single, geometrically accurate surface image. The primary steps are the capture of multiple images as the ammunition moves by on the conveyor, followed by warping to correct for the distortion induced by the curved projectile surface. The individual images are then tiled together to form one 2D image of the complete circumference. Once this image has been formed, an automatic identification algorithm performs feature extraction and classification.
An initial step in goal-oriented dynamic vision is tracking a nonstationary object, or target, and maintaining its position in the center of the field of view for detailed analysis. Any image analysis performed by a dynamic vision system must be able to clearly distinguish between the image flow generated by the changing position of the camera and that generated by the movement of potential targets. Many image-based motion analysis techniques are, however, unable to deal effectively with the complexities of dynamic vision because they attempt to calculate true velocities and accurately reconstruct 3D depth information from spatial and temporal gradients. An alternative pattern classification technique has been developed for qualitatively identifying regions in the image plane that most likely correspond to moving targets. This approach is based on the notion that all projected velocities arising from a camera moving through a rigid environment lie along a line in the local velocity space. Each point on this constraint line maps to a circle that represents all corresponding velocities parallel to the direction of the spatial gradient. If the camera motion is known, then the gradient-parallel velocity vectors associated with an independently moving target are unlikely to fall in the region arising from the union of all the circles generated by the points along the constraint line. Imprecise or approximate knowledge of the camera motion can be utilized if the projected velocities associated with the constraint line are modeled as radial fuzzy sets with supports in the local velocity space. Homogeneous regions in the image plane that violate these camera velocity constraints become possible fixation points for advanced tracking and detailed scene analysis.
Intelligent transportation systems have generated a strong need for intelligent camera systems to meet the requirements of sophisticated applications such as electronic toll collection (ETC), traffic violation detection, and automatic parking lot control. To achieve the highest levels of detection accuracy, these cameras must have high-speed electronic shutters, high resolution, high frame rate, and communication capabilities. A progressive scan interline transfer CCD camera, with its high-speed electronic shutter and resolution capabilities, provides the basic functions to meet the requirements of a traffic camera system. Unlike most industrial video imaging applications, traffic cameras must deal with harsh environmental conditions and an extremely wide range of lighting. Optical character recognition is a critical function of a modern traffic camera system, with detection and accuracy heavily dependent on the camera function. To operate under demanding conditions, communication and functional optimization is implemented to control cameras from a roadside computer. The camera operates with a shutter speed faster than 1/2000 sec to capture highway traffic both day and night. Consequently, camera gain, pedestal level, shutter speed, and gamma functions are controlled by a look-up table containing parameters matched to environmental conditions, particularly lighting. Lighting conditions are studied carefully to focus only on the critical license plate surface. A unique light sensor permits accurate reading under a variety of conditions, such as sunny days, evening, twilight, and storms. These camera systems are being deployed successfully in major ETC projects throughout the world.
This paper proposes a new algorithm for tracking objects and object boundaries. The algorithm was developed and applied in a system for compositing computer-generated images with real-world video sequences, but it applies generally to tracking systems where accuracy and high processing speed are required. It is based on the analysis of histograms obtained by summing the pixels of edge-segmented images along chosen axes. Edge segmentation is done by spatial convolution with a gradient operator; the advantage of this approach is that it can be performed in real time using commercially available hardware convolution filters. After edge extraction and histogram computation, the positions of the maxima in the edge-intensity histograms of the current and previous frames are compared and matched. The resulting information about the displacement of the histogram maxima converts directly into information about the changes in the target boundary positions along the chosen axes.
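The matching step can be sketched as follows: project the edge image onto each axis by summing, then take the shift of the projection maxima between frames as the displacement estimate. This is a minimal single-peak illustration, assuming a lone dominant target; the paper's system matches multiple maxima.

```python
import numpy as np

def projection_shift(edges_prev, edges_curr):
    """Estimate target displacement from the edge-intensity histograms
    obtained by summing the edge image along each axis: row sums give a
    vertical profile, column sums a horizontal one, and the shift of
    each profile's maximum between frames gives (dx, dy)."""
    def peak(profile):
        return int(np.argmax(profile))
    dy = peak(edges_curr.sum(axis=1)) - peak(edges_prev.sum(axis=1))
    dx = peak(edges_curr.sum(axis=0)) - peak(edges_prev.sum(axis=0))
    return dx, dy
```

Because each frame costs only two 1D projections and two argmax searches after the hardware convolution, the per-frame software work is linear in the image size.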
The use of an off-the-shelf general-purpose processing system supplied by Giga Operations as applied to real-time video applications is described. The system is modular enough to be used in many scientific and industrial applications and powerful enough to maintain the throughput required for real-time video processing. This hardware and the associated programming environment have enabled InterScience to pursue research in real-time data compression, real-time Electronic Speckle Pattern Interferometry (ESPI) image processing, and industrial quality control and manufacturing. The system is based on Xilinx 4000 series field programmable gate arrays with associated static and dynamic random access memory, in an architecture optimized for video processing on either the VL-Bus or PCI. This paper focuses on the design and development of a real-time frame subtractor for ESPI using this technology. Examples of the improvement in research capability provided by real-time frame subtraction are shown, including images from biomedical experiments. Further applications based on this system are described, including real-time data compression, quality control for production lines as part of an automated inspection system, and a multi-camera security system that uses motion estimation to automatically prioritize camera selection.
The temporal gradient can be used to extract information from motion. This has been done before, but higher performance is reached if it is computed in real time. We propose an architecture to evaluate it from consecutive images in a video signal. Behind a delaying structure, a circuit operates over the neighborhood of each pixel: not only the adjacent pixels in space but also those in time. The gradient can be extracted from all of them by convolution, and other nonlinear algorithms can also be applied. The well-known 3 x 3 masks for static images are generalized to 3D masks that operate over dynamic image sequences. The circuit is based on field programmable gate array (FPGA) devices, so a set of special-purpose hardware designs can be loaded into the RAM cells of the FPGA; it is therefore as fast as hardware and as versatile as software, and different gradient operators can be loaded from the host computer into the programmable logic of the FPGA.
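The generalization from 3 x 3 spatial masks to 3D spatio-temporal masks can be sketched in software. The particular mask below (3 x 3 spatial averaging combined with a central difference in time, Prewitt-style) is an illustrative choice, not necessarily the one loaded into the FPGA.

```python
import numpy as np

def temporal_gradient(frames):
    """3x3x3 spatio-temporal gradient sketch over three consecutive
    frames: 3x3 spatial averaging within each frame, central difference
    along time. The centre frame gets zero weight in the temporal
    central difference, mirroring a Prewitt-style 3D mask."""
    f0, f1, f2 = (np.asarray(f, dtype=float) for f in frames)
    def smooth(img):
        # 3x3 box average on the interior; the one-pixel border stays 0
        out = np.zeros_like(img)
        h, w = img.shape
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out[1:h-1, 1:w-1] += img[1+dy:h-1+dy, 1+dx:w-1+dx] / 9.0
        return out
    return (smooth(f2) - smooth(f0)) / 2.0   # central difference in t
```

On a sequence whose brightness ramps by one grey level per frame, the interior gradient comes out as exactly 1.0, the per-frame rate of change.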
Single-chip digital cameras use a color filter array and a subsequent interpolation strategy to produce full-color images. While the design of the interpolation algorithm can be grounded in traditional sampling theory, the fact that the sampled data are distributed among three different color planes adds a level of complexity. Previous treatments of this problem were based on computationally intensive approaches such as iteration. Such methods, while effective, cannot be implemented in today's crop of digital cameras due to the limited computing resources of the cameras and their accompanying host computers. These previous methods are usually derived from general numerical methods that make few assumptions about the nature of the data. Significant computational economies, without serious losses in image quality, can be achieved by recognizing that the data are image data and assuming an appropriate image model. To this end, the design of practical, high-quality color filter array interpolation algorithms based on a simple image model is discussed.
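As a minimal baseline of what a CFA interpolation does, here is bilinear recovery of the green plane from an RGGB Bayer mosaic; each missing green value is the average of its measured green neighbours. This sketch is the kind of cheap starting point that the model-based designs in the paper improve on, not the paper's algorithm itself.

```python
import numpy as np

def interp_green(mosaic):
    """Bilinear interpolation of the green plane from an RGGB Bayer
    mosaic. In the RGGB layout the green samples sit where (row+col)
    is odd; at red/blue sites we average the measured 4-neighbour
    green samples (border sites simply average fewer neighbours)."""
    m = np.asarray(mosaic, dtype=float)
    h, w = m.shape
    yy, xx = np.mgrid[0:h, 0:w]
    green_mask = (yy + xx) % 2 == 1
    g = np.where(green_mask, m, 0.0)
    pad = np.pad(g, 1)                       # zero border for shifts
    cnt = np.pad(green_mask.astype(float), 1)
    nbr_sum = pad[:-2, 1:-1] + pad[2:, 1:-1] + pad[1:-1, :-2] + pad[1:-1, 2:]
    nbr_cnt = cnt[:-2, 1:-1] + cnt[2:, 1:-1] + cnt[1:-1, :-2] + cnt[1:-1, 2:]
    return np.where(green_mask, m, nbr_sum / np.maximum(nbr_cnt, 1))
```

The cost is a handful of shifted adds per pixel, which is why such linear schemes fit in-camera while iterative reconstructions do not.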
This paper presents a real-time digital signal processing system to detect the presence of pedestrians waiting to cross an intersection. A pedestrian detection algorithm is developed and implemented on a TMS320C40 DSP chip such that video signals captured and transmitted from an actual intersection are processed in real time. The detection algorithm consists of background update and background subtraction components applied to image blocks. The background update copes with the brightness changes that occur in actual outdoor scenes. In addition, waiting and moving pedestrians are distinguished by using the temporal changes of image block intensities. The results obtained indicate that this DSP implementation provides an acceptable detection rate for this traffic management problem.
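The two components named above can be sketched together: an exponential running-average background update followed by block-wise subtraction and thresholding. The update rate `alpha`, threshold, and block size are illustrative values, not the paper's parameters.

```python
import numpy as np

def detect_blocks(frame, background, alpha=0.05, thresh=15.0, block=8):
    """Block-wise background subtraction with a running-average update.
    The update lets the background track slow outdoor brightness
    changes; blocks whose mean absolute difference exceeds the
    threshold are flagged as occupied. Returns (new background,
    boolean block map)."""
    frame = np.asarray(frame, dtype=float)
    # background update: exponential running average
    background = (1.0 - alpha) * background + alpha * frame
    # background subtraction, aggregated over block x block tiles
    diff = np.abs(frame - background)
    h, w = frame.shape
    hb, wb = h // block, w // block
    tiles = diff[:hb * block, :wb * block].reshape(hb, block, wb, block)
    occupied = tiles.mean(axis=(1, 3)) > thresh
    return background, occupied
```

Distinguishing waiting from moving pedestrians would then compare these block maps across time, as the abstract describes.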
(Pb0.91La0.09)(Zr0.65,Ti0.35)0.9775O3 (PLZT 9/65/35), commonly used as an electro-optical shutter, exhibits large phase retardation at low applied voltage. This shutter offers (1) high shutter speed, (2) wide optical transmittance, and (3) high optical density in the 'OFF' state. If the shutter is applied as the diaphragm of a video camera, it can protect the sensor from intense light. We have tested the basic characteristics of the PLZT electro-optical shutter and its imaging resolving power. The ratio of optical transmittance between the 'ON' and 'OFF' states was 1.1 x 10^3. The response time of the PLZT shutter from the 'ON' state to the 'OFF' state was 10 microseconds. The MTF reduction when the PLZT shutter is placed in front of the visible video camera lens was only 12 percent at a spatial frequency of 38 cycles/mm, the sensor resolution of the video camera. Moreover, we captured the visible image of the Si-CCD video camera: a He-Ne laser ghost image was observed in the 'ON' state, while in the 'OFF' state the ghost image was completely blocked. From these tests, the PLZT shutter has been found useful as a diaphragm for visible video cameras. The measured optical transmittance of a PLZT wafer with no antireflection coating was 78 percent over the range from 2 to 6 microns.
Reports of training where a simulator is run above normal speed suggest potential benefits, but are confounded by the use of mixed training schedules. We report how two groups of participants were trained on a target tracking and acquisition simulator, with simulator speed as the only between-groups variable. Each group had two familiarization runs and 20 training runs, in which circular targets travelled in sets of two or three across an out-of-focus, digitized, real-world landscape and could be tracked and acquired in a sequence chosen by each participant. The sole difference between groups was that one was given targets travelling 90 percent faster than normal speed over the same distance. Twenty-four hours later, each group was tested on sets of four targets travelling at normal speed. The participants trained under accelerated conditions were significantly faster at acquiring each target than the control group, and equally accurate. The accelerated group came near to peak performance during training, while the control group was still improving throughout the test phase. We conclude that training under accelerated conditions can offer the alternative potential advantages of reaching a specified standard in a shorter training period, or of attaining a higher standard in the same amount of training time.
This paper proposes a pointing device that uses vision with gaze control, performing its task by tracking and recognizing the user's hand. To realize this gaze-control operation, i.e., restricting processing to a certain region and suppressing data outside that region, we introduce the 'local-moment feature', defined as moments over a local area of an image. Since the local moment suppresses unnecessary information outside the area, it can be applied effectively to images containing significant noise and obstructing objects. Compared with ordinary moments, the computational cost of local moments is fairly small, so they can easily be applied in a real-time system. Using the proposed system, a user can point to a position on the screen by moving a hand, and can manipulate objects on the screen in several ways, such as 'pick', 'release', or 'tap', by forming the fingers into particular shapes. The proposed system has been implemented on a personal computer with a video capture board, and its validity has been shown by tracking and recognition experiments.
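The local-moment idea can be sketched as ordinary moments restricted to a window around the current gaze point, from which a local centroid follows. The square-window geometry here is an assumption for illustration; the paper does not fix the window shape in this abstract.

```python
import numpy as np

def local_moments(img, center, radius):
    """Local-moment feature sketch: zeroth and first moments computed
    only inside a square window around `center`, so everything outside
    the window is suppressed. Returns the local centroid (y, x), or
    `center` unchanged if the window is empty."""
    img = np.asarray(img, dtype=float)
    cy, cx = center
    y0, y1 = max(0, cy - radius), min(img.shape[0], cy + radius + 1)
    x0, x1 = max(0, cx - radius), min(img.shape[1], cx + radius + 1)
    roi = img[y0:y1, x0:x1]
    yy, xx = np.mgrid[y0:y1, x0:x1]
    m00 = roi.sum()                      # zeroth moment over the window
    if m00 == 0:
        return center
    return (yy * roi).sum() / m00, (xx * roi).sum() / m00
```

Feeding each frame's centroid back in as the next window center gives the tracking loop: bright clutter outside the window, however strong, cannot pull the estimate away.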
Multimedia systems are a typical application of time-critical computing. In networked multimedia systems such as video conferencing, real-time image communication is key to system success. To address this problem, we employ a time-critical computing technique. In our method, images are decomposed into intensity information and color information. The intensity information of an image is preserved, while the color information may be spatially sub-sampled, yielding images at different color resolutions. When an image is transmitted over the network, its intensity information is transferred first, followed by low-resolution color information; time permitting, color information of higher resolution can be incorporated. The result is that we can discard a large portion of the color data and let the visual system map the color information onto the intensity image. With this method, the image data to be transferred over the network are greatly reduced, and the time constraints of real-time image communication are satisfied. This technique is used successfully in our cooperative CAD systems.
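The decomposition can be sketched as follows: a full-resolution intensity plane plus a coarse-to-fine stack of sub-sampled colour residuals, so a sender can ship intensity first and refine colour only as time permits. The intensity definition (channel mean) and the factor list are illustrative assumptions, not the paper's exact scheme.

```python
import numpy as np

def progressive_color(rgb, factors=(8, 4, 2, 1)):
    """Decompose an RGB image into full-resolution intensity plus a
    list of colour-residual layers sub-sampled at coarse-to-fine
    factors. Intensity is always kept whole; colour layers can be
    transmitted progressively, finer ones only if time allows."""
    rgb = np.asarray(rgb, dtype=float)
    y = rgb.mean(axis=2)                          # intensity plane
    chroma = rgb - y[:, :, None]                  # colour residual
    layers = [chroma[::f, ::f] for f in factors]  # coarse -> fine
    return y, layers
```

Adding the finest layer back onto the intensity plane reconstructs the image exactly; stopping earlier trades colour resolution for transmission time, which is the time-critical degradation the abstract describes.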
We consider the logical and physical architecture of an analog/digital video network. These networks allow full-screen, real-time motion video distribution concurrently with regular network activities, without slowing them down. The analog/digital video network uses two parallel communication media: analog for transmitting video images and digital for data transmission. Any video terminal in the network can be connected locally to a computer or to the analog video bus. The set of video terminals connected to the analog medium defines the current configuration of the video network. This configuration depends on a number of random real-time events and can be changed automatically by computing a set of specially defined functions, which vary with the application. The computation of these functions and the resulting dynamic changes in the network configuration constitute the adaptive character of the video network. A regular digital network is used to control the video network; for this purpose, a special management protocol was developed.
Kayser-Threde has invested many years in developing technology for crash testing, data acquisition, and test data archiving. Since 1976, the Measurement Systems department has been supplying European car manufacturers and test houses with ruggedized on-board data acquisition units for use in safety tests according to SAE J 211. The integration of on-board high-speed digital cameras has completed the data acquisition unit. Stationary high-speed cameras for external observation are also included in the controlling and acquisition system of the crash test site. Kayser-Threde's High Speed Data Systems department focuses on the design and integration of computerized data flow systems under real-time conditions. The special circumstances of crash test applications are taken into account for data acquisition, mass storage, and data distribution. The two fundamental components of the video data archiving system are, first, the recording of digital high-speed images together with digital test data and, second, organized filing in mass archiving systems with near on-line access. In combination with sophisticated and reliable hardware components, Kayser-Threde is able to deliver high-performance digital data archives with storage capacities of up to 2600 terabytes.
We have developed a set of tools to build 3D images of vascular structures from contiguous slices. Slices were obtained from plastic-filled vascular casts. Major vascular branches on each slice were revealed and observed under the optical microscope. A series of video images of the branches on contiguous slices were then digitized. Each image was processed to isolate the vessels of interest. A user interface was built to select diameters, depths, and branching angles. A 3D image can then be viewed from any angle.