This paper addresses the problem of motion analysis performed from digital data captured by a network of motion sensors distributed over a three-dimensional field of interest. Motion analysis means performing motion detection, motion-oriented classification, estimation and prediction of kinematic parameters, tracking to build trajectories, and warning of the occurrence of potential abnormalities, incidents or accidents. Kinematic parameters are defined as spatial and temporal positions, velocity, scale and orientation. The entire system can be decomposed into three major components. First, a network of sensors captures and generates all relevant motion information. Second, a tree-structured telecommunication system concentrates all motion information to a data sink or gateway. Third, an Artificial Intelligence (A.I.) in a remote monitoring center processes the entire data stream transmitted from the gateway. The A.I. is composed of three major components: a Simulating Software, a Deep Learning System, and an Expert System. This paper addresses the structural relation between the motion sensor network and the artificial intelligence in order to display on a screen a complete and real-time motion analysis of the events taking place in a three-dimensional field of interest. This work addresses and compares different motion sensors. The reference network is made of motion sensors based on passive photodetection. Other sensor networks of interest are based on active detection, namely ultrasonic waves (SONAR), microwaves (RADAR) and lasers (LIDAR). A limited number of video cameras turns out to be unavoidable with any motion sensor network, either active or passive, distributed or localized. Video cameras are required to produce high-resolution images allowing pattern recognition and motion disambiguation.
To conclude, a comparison is presented of different distributed systems that perform motion analysis through different potential technologies for motion sensor networks. To network efficiently in real time with an A.I., two main challenging questions are raised, related first to the structure of the motion information and second to the amount to be transmitted. Distributed passive photodetection sensor networks are optimal solutions for long-term indoor or short-term outdoor analyses. Active sensor networks are optimal solutions to extend long-term motion analysis to the surrounding outdoors.
This paper addresses the problem of motion analysis performed from digital data captured by a network of motion sensors scattered over a 3D+T field of interest. The digital signals captured in the field are transmitted through a telecommunication network to be processed in a remote monitoring center by an artificial intelligent system based on a dual control involving both deep learning and model-based algorithms. Motion analysis proceeds through consecutive steps as detection, motion-oriented classification, parameter estimation, tracking and building trajectories. Eventually, the system can detect and predict abnormalities, incidents and accidents. In all current applications, it is commonly thought that motion analysis is to be performed from data streams captured from multiple video cameras distributed in a network following a so-called "camera-everywhere" approach. A basic observation of how animal biology proceeds shows that information analyzed by the cortex usually originates from one global eye and from a network of sensors non-uniformly distributed over the entire skin. Telecommunications are performed through a network of nerves that acts as a bundle of telephone lines connecting each sensor located in the eye and in the skin to specific and dedicated areas of the cortex. This paper shows the relevance of this natural way to perform full motion analysis from a network of motion sensors distributed over a field of interest where the 3D+T motion analysis is performed. Using all video cameras in a network involves two main drawbacks. First, this setting requires transmitting over a telecommunication network an amount of information of which more than ninety-nine percent of the content is useless for motion analysis. Second, additional software algorithms that consume time and resources have to be implemented to extract the motion information from the video signal and build the trajectories.
Video cameras are in fact useful for both pattern recognition and motion disambiguation, and should therefore be limited in number and located at key spots or on robots moving in the field. Eventually, a central station implements the motion analysis algorithm. This paper describes the application with a three-layer scheme and compares the two approaches, namely all video cameras versus all motion sensors, in terms of networking and processing the motion information in the remote monitoring center.
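The claim that more than ninety-nine percent of a camera network's output is useless for motion analysis can be illustrated with back-of-envelope arithmetic. All figures below are illustrative assumptions (camera count, sensor count, report rates and sizes are not taken from the paper):

```python
# Raw bit rate of an all-camera network (assumed: 16 uncompressed
# 1080p RGB cameras at 25 frames per second).
n_cameras = 16
video_rate = n_cameras * 1920 * 1080 * 3 * 8 * 25      # bit/s

# Aggregate rate of a motion-sensor network covering the same field
# (assumed: 500 sensors, 10 reports/s, 64-byte report carrying a
# timestamp, a position and a velocity estimate).
n_sensors = 500
report_rate = 10                                       # reports per second
report_size = 64 * 8                                   # bits per report
sensor_rate = n_sensors * report_rate * report_size    # bit/s

ratio = video_rate / sensor_rate
print(f"video: {video_rate/1e9:.1f} Gbit/s, "
      f"sensors: {sensor_rate/1e6:.2f} Mbit/s, ratio ~ {ratio:.0f}x")
```

Under these assumptions the camera network generates several thousand times more raw data, which is consistent with the order of magnitude claimed in the abstract.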
This paper addresses the problem of motion analysis performed from digital data captured by a network of motion sensors scattered over a field of interest where 3D+T motion analysis is performed. Motion analysis, as referred to here for digital signals, proceeds through consecutive steps of detection, motion-oriented classification, parameter estimation and tracking. The scheme proposed in this paper is relevant to applications that can be found in medicine, earth science, surveillance and defense. The major challenges involved in the feasibility of this network are as follows: signal sampling from a sensor network, photodetection, and an optimal strategy to cope with energy harvesting and wireless communication capabilities. The motion sensors implement wireless communications to some gateway or data sink that relays the collected information to a remote central station. Motion sensors are assigned to catch motion with highly sensitive, sparsely distributed sensors and to build the trajectories. Other sensors can be added to the system for specific purposes, such as video cameras. Video cameras are assigned to catch high-resolution images or videos with densely and regularly distributed sensors to perform pattern classification and recognition. The central station implements the motion analysis algorithm. Motion analysis is performed as a dual control referring to both an accurate model based on theoretical mechanics and an adaptive learning system based on a supervised neural network. This paper describes the effective components of the system, namely the sensor layer, the telecommunication layer, and the application layer.
This paper addresses the problem of motion analysis performed in a signal sampled on an irregular grid spread in 3-dimensional space and time (3D+T). Nanosensors can be randomly scattered in the field to form a "sensor network". Once released, each nanosensor transmits at its own fixed pace information which corresponds to some physical variable measured in the field. Each nanosensor is supposed to have a limited lifetime given by a Poisson-exponential distribution after release. The motion analysis is supported by a model based on a Lie group called the Galilei group that refers to the actual mechanics that takes place on some given geometry. The Galilei group has representations in the Hilbert space of the captured signals. Those representations have the properties to be unitary, irreducible and square-integrable and to enable the existence of admissible continuous wavelets fit for motion analysis. The motion analysis can be considered as a so-called "inverse problem" where the physical model is inferred to estimate the kinematical parameters of interest. The estimation of the kinematical parameters is performed by a gradient algorithm. The gradient algorithm extends to trajectory determination. Trajectory computation is related to a Lagrangian-Hamiltonian formulation and fits into a neuro-dynamic programming approach that can be implemented in the form of a Q-learning algorithm. Applications relevant for this problem can be found in medical imaging, Earth science, military, and neurophysiology.
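The gradient estimation of a kinematic parameter can be sketched on synthetic data. The snippet below is a minimal, illustrative stand-in for the wavelet-energy gradient algorithm: a Gaussian feature moving in 1D+T plays the role of the captured signal, and a motion-tuned correlation energy plays the role of the velocity-tuned wavelet energy density; a normalized gradient step then climbs the energy surface toward the true velocity. None of the names or parameter values come from the paper:

```python
import math

def blob(x, t, v):
    # A Gaussian feature travelling at velocity v.
    return math.exp(-0.5 * (x - v * t) ** 2)

v_true = 1.3
X = [0.1 * i for i in range(200)]       # spatial sampling grid
T = range(10)                           # frame indices
frames = [[blob(x, t, v_true) for x in X] for t in T]

def energy(v):
    # Correlation of the frames with a template travelling at candidate
    # velocity v: a crude stand-in for a velocity-tuned wavelet energy.
    return sum(frames[t][i] * blob(x, t, v)
               for t in T for i, x in enumerate(X))

# Normalized (sign-based) gradient ascent with a finite-difference
# derivative; the fixed step keeps the iteration stable without tuning.
v, step, h = 0.0, 0.02, 1e-4
for _ in range(100):
    grad = (energy(v + h) - energy(v - h)) / (2 * h)
    v += step * (1 if grad > 0 else -1)
```

The estimate converges to within the step size of the true velocity; in the paper the same principle operates on the full 3D+T Galilei-group parameters rather than on a single scalar.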
This paper presents new results on the tracking of ballistic missile warheads using spatio-temporal wavelets. Here we focus our attention on handling more general classes of motion, such as acceleration. To accomplish this task the spatio-temporal wavelet transform is adapted to the motion parameters on a frame-by-frame basis. Three different energy densities, associated with velocity, location, and size, have been defined to determine motion parameters. We point out that maximizing these energy densities is equivalent to a minimum squared error estimation. Tracking results on synthetically generated image sequences demonstrate the capabilities of the proposed algorithm.
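The stated equivalence between energy-density maximization and minimum squared-error estimation can be sketched in one line. This is a standard Hilbert-space identity, not reproduced from the paper: for a signal $f$ and a wavelet $\psi_v$ tuned to a candidate parameter $v$, with $\|\psi_v\|$ independent of $v$,

```latex
\|f - \psi_v\|^2 \;=\; \|f\|^2 + \|\psi_v\|^2 - 2\,\mathrm{Re}\,\langle f, \psi_v\rangle ,
```

so with $\|f\|$ and $\|\psi_v\|$ fixed, minimizing the squared error over $v$ amounts to maximizing the correlation $\mathrm{Re}\,\langle f, \psi_v\rangle$, and hence the associated energy density $|\langle f, \psi_v\rangle|^2$.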
This paper addresses the problem of tracking a ballistic missile warhead. In this scenario, the ballistic missile is assumed to be fragmented into many pieces. The goal of the algorithm presented here is to track the warhead that is among the fragments. It is assumed that images are acquired from an optical sensor located in the interceptor nose cone. This imagery is used by the algorithm to steer the course of interception. The algorithm proposed in this paper is based on continuous spatio-temporal wavelet transforms (CWTs). Two different energy densities of the CWT are used to perform velocity detection and filtering. Additional post-processing is applied to discriminate among objects traveling at similar velocities. Particular attention is given to achieving robust performance on noisy sensor data and under conditions of temporary occlusions. First we introduce the spatio-temporal CWT and stress the relationships with classical orientation filters. Then we describe the CWT-based algorithm for target tracking, and present results on synthetically generated sequences.
This paper aims at presenting the major steps towards the elaboration of an optimum control for video transmissions on ATM networks. The paper puts forward the gain in statistical multiplexing to demonstrate that transmitting at variable rates on asynchronous multiplexing links is more efficient than exploiting constant rates on synchronous links. Optimum coding and transmission require characterizing the video sources of information as entropy generators and developing entropy rate-distortion functions in the coder and the transmission channel. Quantizers and VLC in coding, traffic and queues in transmission multiplexing, each lead to performance functions expressing quality in terms of entropy rate: respectively, the PSNR as a function of the output data rate, and the cell losses as a function of the network load. The main advantage of transmitting on variable bit rate channels is to favor the generation of image sequences at constant subjective quality on the coding side, and the saving of transmission bandwidth through a gain in statistical multiplexing on the network side. Mirror control algorithms can be implemented at the coding end and in the multiplexing nodes to optimally manage the rate-distortion functions.
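The statistical multiplexing gain invoked above can be illustrated numerically. The sketch below uses a Gaussian approximation of the aggregate rate of independent variable-bit-rate sources; all source statistics (mean, deviation, peak) are assumed values for illustration, not measurements from the paper:

```python
import math

# Assumed per-source statistics, in Mbit/s.
N, mu, sigma, peak = 50, 4.0, 1.5, 10.0

# Synchronous (constant-rate) links must reserve the peak rate for
# every source.
cbr_capacity = N * peak

# An asynchronous multiplexed link only has to keep the overflow
# probability small; with a Gaussian approximation of the aggregate
# rate, capacity = N*mu + z*sqrt(N)*sigma, where z = 3.1 corresponds
# to roughly a 1e-3 overflow probability.
z = 3.1
vbr_capacity = N * mu + z * math.sqrt(N) * sigma

gain = cbr_capacity / vbr_capacity
print(f"CBR: {cbr_capacity:.0f} Mbit/s, "
      f"VBR: {vbr_capacity:.0f} Mbit/s, gain {gain:.2f}x")
```

Under these assumptions the multiplexed link needs less than half the synchronous bandwidth; the gain grows with the number of sources because the relative fluctuation of the aggregate shrinks as 1/sqrt(N).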
This paper describes a 3D spatio-temporal coding algorithm for the bit-rate compression of digital image sequences. The coding scheme is based on different specificities, namely a motion representation with a four-parameter affine model, a motion-adapted temporal wavelet decomposition along the motion trajectories, and a signal-adapted spatial wavelet transform. The motion estimation is performed on the basis of four-parameter affine transformation models, also called similitudes. This transformation takes into account translations, rotations and scalings. The temporal wavelet filter bank exploits bi-orthogonal linear-phase dyadic decompositions. The 2D spatial decomposition is based on dyadic signal-adaptive filter banks with either para-unitary or bi-orthogonal bases. The adaptive filtering is carried out according to a performance criterion to be optimized under constraints in order to eventually maximize the compression ratio at the expense of graceful degradations of the subjective image quality. The major principles of the present technique are, in the analysis process, to extract and separate the motion contained in the sequences from the spatio-temporal redundancy and, in the compression process, to take into account the rate-distortion function on the basis of the spatio-temporal psycho-visual properties to achieve the most graceful degradations. To complete this description of the coding scheme, the compression procedure is therefore composed of scalar quantizers which exploit the spatio-temporal 3D psycho-visual properties of the Human Visual System and of entropy coders which finalize the bit rate compression.
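The four-parameter similitude model named above can be written down compactly: a point is scaled by s, rotated by theta, and translated by (tx, ty). The snippet below is a minimal sketch of that transformation (the function name and the example values are illustrative, not taken from the paper):

```python
import math

def similitude(point, s, theta, tx, ty):
    # Four-parameter affine model: scale s, rotation theta,
    # translation (tx, ty) -- translations, rotations and scalings only.
    x, y = point
    c, si = math.cos(theta), math.sin(theta)
    return (s * (c * x - si * y) + tx,
            s * (si * x + c * y) + ty)

# Example: (1, 0) rotated 90 degrees -> (0, 1), scaled by 2 -> (0, 2),
# then translated by (0.5, -1.0) -> (0.5, 1.0).
p = similitude((1.0, 0.0), s=2.0, theta=math.pi / 2, tx=0.5, ty=-1.0)
```

With only four parameters instead of the six of a general affine map, the model excludes shear, which keeps the motion estimation well-conditioned while still covering translation, rotation and zoom.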
The huge increase in communication services has motivated the investigation of new algorithmic avenues to solve the perpetual rate/distortion trade-off in video coding applications. The problem is nowadays framed under very severe constraints: bit-rate budgets keep shrinking, down to 8 kbit/s on domestic communication networks. Two approaches are explored. The first consists of extending the standards already developed in image communication by successive improvements and alternatives inside the algorithmic frameworks. The second proposes a complete change of the basic tools and a design of hybrid schemes by introducing some new methodological techniques which have already been validated by the computer vision community but never introduced, for compatibility or complexity reasons, into the encoding strategies. This tutorial paper presents these different issues and discusses their advantages. Some experiments performed in our laboratory to encode time-varying image sequences efficiently are eventually presented.
This paper aims to describe the digital TV and HDTV codecs as variable bit rate sources to be transmitted on ATM networks. The bit rates are studied and modeled at different time-scales ranging from the sub-image level (a few microseconds) up to the program level (several hours). At the lowest level, corresponding to the microscopic time constants of the network, the cell interarrival times are considered. At higher levels, corresponding to the macroscopic time-scales of the image encoding algorithm, the bit or cell rates are considered as counting processes over different time spans. To validate the theoretical approach, an experiment has been set up to process with a digital TV codec 25 hours of real TV images recorded on D1 tapes. The bit rates and the cell interarrival times have been collected, and statistics have been carried out to validate the proposed models.
This paper describes source models of bit rate for digital TV codecs based on the Discrete Cosine Transformation as a decorrelation technique. These models are based on cyclostationary processes, whose periodic structure is preserved over the whole TV transmission. On this basis, the influence of image statistics on the coders, the transmission networks, and the decoders is presented. An emphasis is given to variable bit rate codecs transmitting on ATM networks.
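The cyclostationary structure of such a bit-rate trace is easy to exhibit on synthetic data: the mean repeats with the frame period, so averaging all samples that share the same phase within the period recovers a stable cyclic mean. The trace below is simulated, with assumed period, amplitude and noise level; nothing here is measured codec data:

```python
import math
import random

random.seed(0)
period = 8                  # samples per frame period (assumed)
frames = 200                # number of periods in the trace

# Synthetic cyclostationary trace: a deterministic periodic mean
# (the per-frame bit-rate profile) plus stationary noise.
trace = [10 + 4 * math.sin(2 * math.pi * (n % period) / period)
         + random.gauss(0, 0.5)
         for n in range(period * frames)]

# Cyclic mean: average all samples having the same phase in the period.
cyclic_mean = [sum(trace[k] for k in range(phase, len(trace), period)) / frames
               for phase in range(period)]
```

Because the deterministic profile repeats exactly while the noise averages out as 1/sqrt(frames), the estimated cyclic mean converges to the underlying periodic shape, which is the property the source models exploit over a whole transmission.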