Video-based tracking of small targets in dense clutter is very difficult: the image
resolution of the target is too low to provide reliable information for matching, while the clutter generates
a large number of false positive matches and distractions. Most traditional methods treat the target in
isolation from its environment, and are thus overwhelmed by these distractions. In fact, a target is rarely
isolated and independent of its environment, e.g., when persistent disturbances are present in its
vicinity. Consequently, there may exist objects that exhibit short-term or even long-term motion correlation
with the target. Such objects constitute a very useful spatial context for the target. Taking advantage of this
contextual information in an efficient way can improve the robustness of target tracking, as the spatial contexts
provide extra constraints in target matching and additional verification in data association. This paper presents
a new approach to context-aware tracking of small targets, in which a set of motion-correlated auxiliary objects
is automatically discovered on-the-fly. The image region of each such auxiliary object generates a specific spatial
context for the target and leads to an individual contextual constraint on the target's motion. Under the
small-motion assumption between two consecutive frames, these individual contextual constraints take linear forms.
The collection of all such individual contextual constraints forms a contextual system, from which the target
motion can be accurately estimated, so that the association of the target across consecutive image frames can be
reliably established. The new approach is computationally efficient. Extensive experiments on real test video
sequences demonstrate the effectiveness and efficiency of the proposed approach.
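The contextual system described above can be illustrated with a small sketch: suppose each auxiliary object i contributes a linear constraint A_i t ≈ b_i on the target's 2-D motion t; stacking all constraints yields an overdetermined system solvable by least squares. The constraint matrices below are purely illustrative, not the paper's actual formulation:

```python
import numpy as np

def estimate_motion(constraints):
    """Stack the linear contextual constraints (A_i, b_i) from all
    auxiliary objects and solve the resulting contextual system for
    the target motion in the least-squares sense."""
    A = np.vstack([A_i for A_i, _ in constraints])
    b = np.concatenate([b_i for _, b_i in constraints])
    t, *_ = np.linalg.lstsq(A, b, rcond=None)
    return t

# Hypothetical example: three auxiliary objects, each constraining
# the 2-D target translation t = (dx, dy).
rng = np.random.default_rng(0)
true_t = np.array([1.5, -0.5])
constraints = []
for _ in range(3):
    A_i = rng.standard_normal((2, 2))
    b_i = A_i @ true_t  # noise-free for illustration
    constraints.append((A_i, b_i))

t_hat = estimate_motion(constraints)
print(np.allclose(t_hat, true_t))  # → True
```

With noisy constraints, the least-squares solution averages out the individual errors, which is why accumulating many auxiliary objects makes the motion estimate more robust.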
Multiple-target tracking in video is an important problem in many emerging applications. It is also a challenging one, in which the coalescence phenomenon often occurs: the tracker associates more than one trajectory with some targets while losing track of others. Coalescence may cause the tracker to fail, especially when similar targets move close to one another or undergo partial or complete occlusion. Existing approaches are mainly based on a joint state-space representation of the multiple targets being tracked, and are therefore confronted with combinatorial complexity due to the intrinsically high dimensionality. In this paper, we propose a novel distributed framework with linear complexity for this problem. The basic idea is a collaborative inference mechanism, in which the estimate of each individual target state is determined not only by its own observation and dynamics, but also through interaction and collaboration with the state estimates of other targets. This leads to a competition mechanism that enables distinct but spatially adjacent targets to compete for common image observations. The theoretical foundation of the new approach is a carefully designed Markov network whose structure can change over time. To perform inference in such a Markov network, a probabilistic variational analysis is conducted, which reveals a mean-field approximation to the posterior density of each target and thereby provides a computationally efficient solution to this difficult inference problem. Compared with existing solutions, the proposed approach stands out for its linear computational cost and its excellent performance on the coalescence problem, as demonstrated in extensive experiments.
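The mean-field idea can be illustrated on a toy discrete pairwise Markov network: each target keeps its own approximate marginal q_i, updated from its local evidence plus the expected compatibility with its neighbors' marginals. The potentials below are illustrative placeholders, not the paper's model; a repulsive pairwise term plays the role of the competition mechanism that keeps adjacent targets from claiming the same observation:

```python
import numpy as np

def mean_field(unary, pairwise, edges, n_iter=50):
    """Mean-field inference on a pairwise Markov network.
    unary[i]  : log-evidence vector for node i (length K)
    pairwise  : K x K log-compatibility matrix shared by all edges
    edges     : list of undirected (i, j) pairs
    Returns approximate marginals q, one row per node."""
    n, K = unary.shape
    q = np.full((n, K), 1.0 / K)
    neighbors = {i: [] for i in range(n)}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(n_iter):
        for i in range(n):
            logit = unary[i].copy()
            for j in neighbors[i]:
                # expected log-compatibility under neighbor's marginal
                logit += pairwise @ q[j]
            logit -= logit.max()  # numerical stability
            q[i] = np.exp(logit)
            q[i] /= q[i].sum()
    return q

# Toy competition: two adjacent targets, two candidate observations
# (states); the repulsive term penalizes both picking the same one.
unary = np.log(np.array([[0.60, 0.40],
                         [0.55, 0.45]]))
repulsion = np.log(np.array([[0.1, 0.9],
                             [0.9, 0.1]]))  # discourage same state
q = mean_field(unary, repulsion, edges=[(0, 1)])
print(q.argmax(axis=1))  # the two targets settle on different states
```

Each node's update is local and each edge is touched once per sweep, so the cost per iteration is linear in the number of targets and edges, which is the source of the linear complexity claimed for the distributed framework.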