Advances in technology have enabled us to collect data from observations, experiments, and simulations at an ever-increasing pace. As these data sets approach the terabyte and petabyte range, scientists are increasingly using semi-automated techniques from data mining and pattern recognition to find useful information in the data. For data mining to succeed, the raw data must first be processed into a form suitable for the detection of patterns. When the data take the form of images, this can involve a substantial amount of processing on very large data sets. To make this task more efficient, we are designing and implementing an object-oriented image processing toolkit that specifically targets massively parallel, distributed-memory architectures. We first show that it is possible to use object-oriented technology to effectively address the diverse needs of image applications. Next, we describe how we abstract out the similarities in image processing algorithms to enable reuse in our software. We also discuss the difficulties encountered in parallelizing image algorithms on massively parallel machines, as well as the bottlenecks to high performance. We demonstrate our work using images from an astronomical data set, and illustrate how techniques such as filtering and denoising through the thresholding of wavelet coefficients can be applied when a large image is distributed across several processors.
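The denoising technique mentioned above can be illustrated with a minimal sketch: a single-level Haar transform of one image row, with the detail coefficients soft-thresholded. This is not the toolkit's code; the distributed-memory aspect (each processor filtering its local tile, with ghost regions exchanged at tile boundaries) and the deeper 2D transforms a real application would use are omitted.

```python
import numpy as np

def haar_step(x):
    # Single-level Haar transform: scaled sums and differences of adjacent pairs.
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return a, d

def inverse_haar_step(a, d):
    x = np.empty(2 * a.size)
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def soft_threshold(d, t):
    # Shrink detail coefficients toward zero by t; small (noisy) ones vanish.
    return np.sign(d) * np.maximum(np.abs(d) - t, 0.0)

def denoise_row(row, t):
    a, d = haar_step(row)
    return inverse_haar_step(a, soft_threshold(d, t))
```

With the threshold set to zero the round trip is exact, which is a quick sanity check on the transform pair.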
This paper presents a framework for adding data and task parallelism to a sequential image processing library. The library contains three modules: one for low-level operators, one for intermediate-level operators, and one for high-level operators. We parallelize the low-level operators by data decomposition, and we are working on adding task parallelism at the image processing application level. We validate our data-parallel approach by testing it with the geometric mean filter and the multibaseline stereo vision algorithm. Experiments on a cluster of workstations show very good speedup.
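The geometric mean filter used for validation computes, at each pixel, the geometric mean of a window, which is conveniently expressed as the exponential of the windowed mean of log intensities. The sketch below is a sequential NumPy version assuming strictly positive pixel values; under data decomposition each node would apply it to its own horizontal strip plus a halo of k//2 border rows.

```python
import numpy as np

def geometric_mean_filter(img, k=3):
    # Geometric mean over a k x k window, computed as exp(mean(log)).
    # Pixels are assumed strictly positive (log is undefined at 0).
    pad = k // 2
    padded = np.pad(np.log(img.astype(float)), pad, mode="edge")
    acc = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            acc += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return np.exp(acc / (k * k))
```

On a constant image the filter is the identity, which makes for a simple correctness check.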
We are interested in running cellular automata in parallel. We present an algorithm that explores the dynamic remapping of cells in order to balance the load between processing nodes. The parallel application runs on a cluster of PCs connected by Fast Ethernet. A general cellular automaton can be described as a set of cells where each cell is a state machine. To compute its next state, each cell needs some information from neighbouring cells. There are no limitations on the kind of information exchanged nor on the computation itself; only the automaton topology defining the neighbours of each cell remains unchanged during the automaton's life. As a typical example of a cellular automaton we consider the image skeletonization problem. Skeletonization requires spatial filtering to be applied repetitively to the image. Each step erodes a thin part of the original image; after the last step, only the image skeleton remains. Skeletonization algorithms require vast amounts of computing power, especially when applied to large images, so skeletonization applications can potentially benefit from parallel processing. Two different parallel algorithms are proposed: one with a static load distribution, consisting of splitting the cells over several processing nodes, and the other with a dynamic load balancing scheme capable of remapping cells during program execution. Performance measurements show that cell migration does not reduce the speedup if the program is already load balanced, and that it greatly improves performance if the parallel application is not well balanced.
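One synchronous update of such an automaton can be sketched as follows. The rule here is plain morphological erosion (a foreground cell survives only if its whole 4-neighbourhood is foreground), which captures the "each step erodes a thin layer" behaviour; a real skeletonization rule such as Zhang-Suen adds connectivity tests that are omitted here, and the parallel version would split the grid's rows over nodes and exchange boundary rows each step.

```python
import numpy as np

def ca_step(grid):
    # One synchronous cellular-automaton update on a binary grid.
    # Each cell reads its 4-neighbourhood; a foreground cell survives
    # only if all four neighbours are foreground, so borders erode first.
    padded = np.pad(grid, 1)
    neighbours = (padded[:-2, 1:-1] & padded[2:, 1:-1] &
                  padded[1:-1, :-2] & padded[1:-1, 2:])
    return grid & neighbours
```

Applied to a 3x3 foreground block, one step leaves only the centre cell, exactly the thin-layer erosion the skeletonization loop relies on.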
This paper proposes the Virtual Video Tape (VVT), a randomly accessible motion image recorder held in main memory. VVT is realized entirely in software, with no special hardware, and is intended as a tool for real-time motion image understanding research. The recent remarkable progress of PC hardware makes gigabytes of main memory available, and such large memories open the possibility of realizing a motion image recorder in software; we propose VVT as an example of this kind of image recorder. With current components, recording times on the order of minutes can be expected. Since the proposed VVT is fully digital, there is no analog medium and no degradation of image quality. Because there is no deterioration of the playback image and no rewinding, VVT should contribute to program development for motion image understanding. Based on the proposed idea, the authors have implemented a prototype VVT and used it to develop visual tracking, real-time face detection, and other applications. Through this implementation and usage experience, we have confirmed the feasibility and effectiveness of the proposed idea. In this paper, the authors discuss the background, required functions, and structure of the recorder. Some implementation issues are also described.
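The core of such a recorder is a fixed-capacity in-memory frame store with random access and no rewinding. The class below is a hypothetical minimal sketch (the paper's actual interface is not specified here): a ring buffer of grayscale frames indexed by absolute frame number, where frames older than the capacity are no longer retained.

```python
import numpy as np

class VirtualVideoTape:
    # Minimal in-memory frame recorder: a ring buffer of frames with
    # random access by absolute frame index. No tape, no rewinding,
    # and playback is bit-exact (no analog degradation).
    def __init__(self, capacity, height, width):
        self.frames = np.zeros((capacity, height, width), dtype=np.uint8)
        self.capacity = capacity
        self.count = 0  # total frames ever recorded

    def record(self, frame):
        # Overwrites the oldest frame once capacity is exceeded.
        self.frames[self.count % self.capacity] = frame
        self.count += 1

    def read(self, index):
        # Random access to any retained frame by absolute index.
        if index < self.count - self.capacity or index >= self.count:
            raise IndexError("frame not retained")
        return self.frames[index % self.capacity]
```

A real VVT would size the buffer from available RAM (gigabytes give minutes of video at typical frame rates, as the abstract notes) rather than a fixed frame count.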
We show how to extract various important information about the motion of a target (an enemy aircraft) from a sequence of its image frames obtained using a single imaging sensor mounted on an aircraft. Specifically, we present an algorithm that estimates the following 12 parameters of an enemy aircraft from its image sequence: position (three), linear velocity (three), attitude (three), and instantaneous angular velocity (three), with acceleration (three) as an option. To extract the attitude, we use an algorithm that matches the captured image frame to a modeled image. The objective function is the correlation between these two images, and we use simulated annealing and downhill simplex methods to maximize it. Finally, estimation of the position and the attitude is accomplished with a 12-state extended Kalman filter. Through simulations, we show that the proposed algorithm is superior to conventional algorithms that do not use the attitude information.
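The simulated annealing step can be sketched generically: perturb the attitude parameters at random, always accept improvements in the correlation objective, and accept worsening moves with a probability that decays as the temperature cools. This is a textbook sketch, not the paper's tuned schedule, and the toy objective below stands in for the image correlation.

```python
import math, random

def simulated_annealing(objective, x0, step=0.2, t0=1.0,
                        cooling=0.95, iters=2000):
    # Maximize `objective` over a real-valued parameter vector.
    # Worse moves are accepted with probability exp(delta / T),
    # with temperature T decaying geometrically each iteration.
    random.seed(0)  # deterministic for reproducibility
    x, fx, t = list(x0), objective(x0), t0
    for _ in range(iters):
        cand = [xi + random.uniform(-step, step) for xi in x]
        fc = objective(cand)
        if fc > fx or random.random() < math.exp((fc - fx) / t):
            x, fx = cand, fc
        t *= cooling
    return x, fx
```

In the paper's setting, `objective` would render the modeled aircraft image at a candidate attitude and return its correlation with the captured frame; the downhill simplex method would then refine the annealing result.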
A robust near real-time magnetic resonance imaging (MRI) based guidance scheme has been developed, validated, and used for in vivo neurosurgical applications. The key concept of the method is to use tomographic imaging techniques, such as MRI, to facilitate the alignment of a trajectory guidance device for a biopsy needle. Since the trajectory of a biopsy needle pivoted at an entry point on the patient's skull has two orientational degrees of freedom, the alignment of the needle can be tracked using a two-dimensional (2D) imaging plane placed perpendicular to the desired trajectory. Using near real-time visual feedback in 2D during the adjustment of the alignment guide, the required trajectory alignment can be translated into a simple targeting task on a computer monitor. This MR-based guidance technique has, in practice, allowed neurosurgeons to accomplish the required alignment of a surgical device to an arbitrary target accurately, in a straightforward procedure on a conventional MR scanner. Actual MR-guided biopsies using the new methodology have shown that it has the required targeting accuracy for neurosurgery, even in the presence of brain shift. The use of the method in 20 MR-guided brain lesion biopsy procedures significantly reduced the surgery time; in fact, the time required for the needle trajectory alignment was less than 1 min. Furthermore, the post-alignment trajectory can be validated using near real-time MRI scans in two orthogonal views before the needle insertion. In conclusion, this scheme provides a unique alternative trajectory guidance and monitoring methodology that can take full advantage of the capabilities of modern imaging techniques such as MRI.
This paper presents a simple algorithm to detect targets moving along the runways and taxiways of an airport from images provided by a Surface Movement Radar, even when the image is very noisy. The aim of the application is to determine the position of aircraft in Advanced Surface Movement Guidance and Control Systems (A-SMGCS). The radar sensor is a prototype operating in the millimetre band (95 GHz), conceived for the surveillance function and developed by Oerlikon Contraves Italiana SpA in the framework of a research project on transportation funded by a grant from the Italian National Research Council. The adaptive filter presented here operates by integrating the radar echo over a moving area, filtering out the background average value adaptively, and thresholding the result in order to get a fix on the target. The choice of the threshold is a trial-and-error process that seeks a trade-off between the hit rate and the false alarm rate. The filter operates on the composition of two consecutive scans, with completely automatic threshold selection.
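The integrate/subtract-background/threshold pipeline can be sketched as below. This is an illustrative single-scan version (the paper composes two consecutive scans, which is omitted), and the factor `k` plays the role of the trial-and-error threshold trading hit rate against false alarms.

```python
import numpy as np

def detect_targets(scan, window=5, k=3.0):
    # Estimate the local background as the mean echo over a moving
    # window, subtract it, and threshold the residual relative to
    # its noise level to flag likely targets.
    pad = window // 2
    padded = np.pad(scan.astype(float), pad, mode="edge")
    background = np.zeros(scan.shape, dtype=float)
    for dy in range(window):
        for dx in range(window):
            background += padded[dy:dy + scan.shape[0], dx:dx + scan.shape[1]]
    background /= window * window
    residual = scan - background
    # k trades hit rate against false-alarm rate.
    return residual > k * residual.std()
```

A single strong echo on a flat background is flagged and nothing else is, which is the behaviour the surveillance function needs on a noisy image.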
Natural dialogues contain many kinds of so-called irregular expressions. Even if the words of a conversation are the same, different meanings can be conveyed by the speaker's feelings or facial expression. To understand dialogues well, a flexible dialogue processing system must infer the speaker's view properly. However, it is difficult to obtain the meaning of the speaker's sentences in various scenes using traditional methods. In this paper, a new approach for dialogue processing that incorporates information from the speaker's face is presented. We first divide conversation statements into several simple tasks. Second, we process each simple task using an independent processor. Third, we employ the speaker's face information to estimate the speaker's view and thereby resolve ambiguities in dialogues. The approach presented in this paper can work efficiently because the independent processors run in parallel, writing partial results to a shared memory, incorporating partial results at appropriate points, and complementing each other. A parallel algorithm and a method for employing face information in dialogue machine translation are discussed, and some results are included.
A multi-tolerance region-growing algorithm for automatically detecting and circumscribing calcifications in digitized mammographic images was developed. Independent studies comparing various segmentation methods showed that the multi-tolerance technique works well. However, the method is computationally expensive, because the validity of the grown region is checked at every tolerance level until the optimal region is obtained for each calcification. Furthermore, a single mammogram may contain as many as a few hundred calcifications. In order to reduce processing time, the calcification detection algorithm was implemented on a cluster of processors using the Message Passing Interface (MPI). In the parallel implementation, the master processor partitions the image via histogram thresholding and sends seed pixels to the slaves to execute the multi-tolerance region-growing procedure. The slave processors grow regions, calculating a few shape parameters at each tolerance level. The parameters are used to compute distance measures, which are compared until the minimum change in distance is achieved. Shape factors are then computed to describe the roughness of each region's final boundary and returned to the master processor. Initial trials have shown a speedup factor of three to eight when comparing the use of 13 slave processors to the use of one slave processor.
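The per-seed work done by each slave is a region grow at a given tolerance. The sketch below shows one such grow (4-connected flood fill accepting pixels within the tolerance of the seed value); the multi-tolerance loop, shape parameters, and distance measures from the paper are omitted, but they would simply call this for increasing tolerances and compare the resulting regions.

```python
import numpy as np
from collections import deque

def grow_region(img, seed, tol):
    # Flood fill: accept 4-connected pixels whose intensity is
    # within `tol` of the seed pixel's intensity.
    h, w = img.shape
    seed_val = float(img[seed])
    region = np.zeros((h, w), dtype=bool)
    region[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if (0 <= ny < h and 0 <= nx < w and not region[ny, nx]
                    and abs(float(img[ny, nx]) - seed_val) <= tol):
                region[ny, nx] = True
                queue.append((ny, nx))
    return region
```

Growing from inside a bright blob with a small tolerance recovers exactly the blob, which is the per-tolerance building block the slaves iterate over.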
A fast algorithm for calculating the bilinear transform in an optical system is proposed. The algorithm is based on the coherent-mode representation of the cross-spectral density function of the illumination and is computationally efficient when the illumination is partially coherent. Numerical examples are studied and compared with theoretical results.
The rapidly increasing popularity of the discrete wavelet transform (DWT) as an effective tool in many signal processing and data compression applications, and its integration into JPEG 2000, have given rise to various DWT algorithms and their VLSI implementations to reduce complexity and enhance performance. In this paper, we present an efficient hardware implementation of the discrete wavelet transform and its deployment on a reconfigurable FPGA-based platform. Our implementation is a novel architecture based on the lifting factorization of the wavelet filter banks. This factorization leads to a block-based parallel DWT architecture suitable for hardware implementation. To overcome the communication overhead associated with the DWT block transform, we utilize the Overlap-State technique to compute the DWT near block boundaries. A VHDL description of the lifting polyphase factorization architecture was developed and ported to an FPGA hardware platform chosen to allow partial and full reconfigurability, so as to accommodate various applications with different filter banks. Our hardware implementation improves performance by more than a factor of two compared to an efficient pipelined FPGA-based implementation.
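The lifting factorization splits a wavelet filter bank into alternating predict and update steps on the even/odd polyphase components. The sketch below shows the simplest case, the Haar wavelet, in Python; the paper's architecture applies the same predict/update structure to other filter banks in VHDL, and this sketch ignores the block boundaries that Overlap-State handles.

```python
def haar_lifting_forward(x):
    # Lifting factorization of the Haar DWT: split into even/odd
    # polyphase components, then predict (difference) and update (average).
    even, odd = x[0::2], x[1::2]
    d = [o - e for e, o in zip(even, odd)]        # predict step
    s = [e + di / 2 for e, di in zip(even, d)]    # update step
    return s, d

def haar_lifting_inverse(s, d):
    # Undo the lifting steps in reverse order, then interleave.
    even = [si - di / 2 for si, di in zip(s, d)]
    odd = [di + e for di, e in zip(d, even)]
    out = []
    for e, o in zip(even, odd):
        out.extend([e, o])
    return out
```

Each lifting step is trivially invertible by subtracting what was added, which is why lifting implementations guarantee perfect reconstruction by construction.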
This paper deals with the implementation of a systolic array architecture in hardware using FPGAs for processing compressed binary images without decompressing them. Specifically, run-length encoding (RLE) is used for compression. Processing images in compressed form provides a significant speedup in the computation. Using a systolic architecture and implementing it in hardware further increases the speed.
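The speedup from processing in compressed form comes from touching one run, not one pixel, per unit of work. The Python sketch below shows run-length encoding of a binary row and a simple operation (foreground pixel count) computed directly on the runs; the paper maps this style of run-level processing onto a systolic array in FPGA hardware, which is not reproduced here.

```python
def rle_encode(row):
    # Encode a binary row as a list of (value, run_length) pairs.
    runs, current, length = [], row[0], 1
    for v in row[1:]:
        if v == current:
            length += 1
        else:
            runs.append((current, length))
            current, length = v, 1
    runs.append((current, length))
    return runs

def count_foreground(runs):
    # Processing in compressed form: one term per run, regardless of
    # how many pixels each run covers.
    return sum(length for value, length in runs if value == 1)
```

For long runs the compressed operation does far fewer steps than a pixel-by-pixel scan, which is the source of the computational speedup the paper reports.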
Sustained operation of high average power solid-state lasers currently requires an adaptive resonator to produce the optimal beam quality. We describe the architecture of a real-time adaptive control system for correcting intra-cavity aberrations in a heat capacity laser. Image data collected from a wavefront sensor are processed and used to control phase with a high-spatial-resolution deformable mirror. Our controller takes advantage of recent developments in low-cost, high-performance processor technology. A desktop-based computational engine and object-oriented software architecture replace the high-cost rack-mount embedded computers of previous systems.
Caching strategies for real-time multimedia systems are investigated in this research. We study the interaction between caching algorithms and various consumption rates arising from different requirements of quality of service (QoS), media types, and network configurations. In particular, we examine two models targeting rate heterogeneity and propose two interval-based schemes called PIB and PISB. Experimental results demonstrate that PISB, which takes both the file size and the consumption rate into account, performs consistently better. Finally, we extend our discussion to adaptivity to interactive operations and to the relationship between Internet proxy caching and memory hierarchy caching.
The current Internet is not designed for real-time applications. There are at least three major problems in developing real-time applications on the Internet: insufficient bandwidth, unpredictable transmission performance, and no support for quality of service. Current research efforts toward providing better services on the Internet fall into two approaches: one focuses on enhancing the application layer, and the other modifies the network layer. This paper follows the first approach, which seems more feasible. Real-time applications have to adopt adaptive strategies based on the dynamics of network status, and the effectiveness of an adaptive strategy depends on whether it can accurately determine the currently available bandwidth. A good estimate of the bandwidth can be obtained through a valid flow and congestion control protocol. In a rate-based feedback scheme, the receiver continuously monitors the quality of the data stream, such as the data loss rate, and sends this information back to the sender. The sender then adjusts the sending rate based on this information, with the goal of minimizing the data loss rate. However, this scheme has at least two potential drawbacks: it may cause network buffer overflow, since it controls the data rate rather than the buffer size, and it is not easy to select a loss ratio threshold. This paper proposes a new Internet video flow control protocol (IVFCP) that adjusts the data sending rate based on the combination of the receiver buffer length, the packet loss ratio, and the current data rate. The flow control protocol runs every round trip rather than periodically or only when congestion happens, so it can control the rate more directly and precisely. This rate-based feedback control protocol is evaluated through simulation, and its performance is compared with that of other protocols.
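A per-round-trip rate update combining the three inputs the abstract names could look like the following. This is an illustrative sketch, not the actual IVFCP formula (which the abstract does not give): on loss it backs off multiplicatively, otherwise it steers the receiver buffer toward a target occupancy, and the gains `alpha` and `beta` and the rate clamps are assumed values.

```python
def next_rate(current_rate, buffer_len, buffer_target, loss_ratio,
              alpha=0.5, beta=0.25, min_rate=64.0, max_rate=8000.0):
    # Hypothetical per-round-trip rate update (rates in kbit/s).
    # Not the paper's exact formula: back off multiplicatively on
    # loss, otherwise steer the receiver buffer toward its target.
    if loss_ratio > 0:
        rate = current_rate * (1.0 - alpha * loss_ratio)
    else:
        # Grow (shrink) in proportion to how far the buffer sits
        # below (above) its target occupancy.
        rate = current_rate * (1.0 + beta * (buffer_target - buffer_len)
                               / buffer_target)
    return max(min_rate, min(max_rate, rate))
```

Controlling on buffer occupancy as well as loss is what addresses the buffer-overflow drawback of pure loss-driven schemes noted above.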
Obtaining the listening rates of radio stations as a function of time is an important instrument for determining the impact of publicity. Since many radio stations are financed by publicity, the exact determination of radio listening rates is vital to their existence and further development. Existing methods of determining radio listening rates are based on face-to-face or telephone interviews with a sample population; these traditional methods, however, require the cooperation and compliance of the participants. In order to significantly improve the determination of radio listening rates, special watches were created which incorporate a custom integrated circuit that samples the ambient sound for a few seconds every minute. Each watch accumulates these compressed sound samples over one full week. The watches are then sent to an evaluation center, where the sound samples are matched against the sound samples recorded from candidate radio stations. The present paper describes the processing steps necessary for computing the radio listening rates, and shows how this application was parallelized on a cluster of PCs using the CAP computer-aided parallelization framework. Since the application must run in a production environment, the paper also describes the support provided for graceful degradation in case of transient or permanent failure of one of the system's components. The parallel sound matching server offers a linear speedup up to a large number of processing nodes, thanks to the fact that disk access operations across the network are pipelined with computations.
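The central matching step, deciding which station a watch's sound sample came from, can be sketched as a normalized cross-correlation search over each candidate station's recording. This brute-force version stands in for the paper's actual (compressed-domain, pipelined) matcher; a production system would correlate compact features rather than raw audio and distribute the stations across nodes.

```python
import numpy as np

def best_match(sample, station_streams):
    # Score each candidate station by the maximum normalized
    # cross-correlation of `sample` against every alignment of
    # the station's recording; return the best station and scores.
    n = len(sample)
    s = (sample - sample.mean()) / (sample.std() + 1e-12)
    scores = {}
    for name, stream in station_streams.items():
        best = -1.0
        for off in range(len(stream) - n + 1):
            w = stream[off:off + n]
            w = (w - w.mean()) / (w.std() + 1e-12)
            best = max(best, float(np.dot(s, w)) / n)
        scores[name] = best
    return max(scores, key=scores.get), scores
```

A sample cut directly from one station's stream scores essentially 1.0 against that station and much lower against others, which is the decision the evaluation center makes per sample.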
A difficult problem in automatic medical image understanding is that for every image type, such as x-ray, and every body organ, such as the heart, there exist specific solutions that do not allow for generalization. Merely collecting all the specific solutions will not achieve the vision of a computerized physician. To address this problem, we propose an intelligent agent approach based on agent-oriented programming, which combines the benefits of object-oriented programming and expert systems. For radiology image understanding, we present a multi-agent system composed of two major types of intelligent agents: radiologist agents and patient agents. A patient agent asks for multiple opinions from radiologist agents in interpreting a given set of images and then integrates the opinions. A radiologist agent decomposes the image recognition task into smaller problems that are solved collectively by multiple intelligent sub-agents. Finally, we present a preliminary implementation and running examples of the multi-agent system.
Our purpose is, in the medium term, to detect characteristic shapes and objects such as airports, industrial plants, planes, tanks, trucks, and so on in aerial images, with great accuracy and a low error rate. We also want to evaluate whether the link between neural networks and multi-agent systems is relevant and effective; if it proves to be, we hope to use this kind of technology in other fields, as an easy and convenient way to represent and use the agents' knowledge, which is distributed and fragmented. After a first phase of preliminary tests to determine whether agents are able to give relevant information to a neural network, we verified that only a few agents running on an image are enough to inform the network and let it generalize the agents' distributed and fragmented knowledge. In a second phase, we developed a distributed architecture allowing several multi-agent systems to run at the same time on different computers with different images. All those agents send information to a multi-neural-network system whose job is to identify the shapes detected by the agents. The name we gave to our project is Jarod.