The owl tracks the mouse, not the blowing leaf. The outfielder tracks the fly ball, not the bird flying by. The jet fighter tracks the strategic missile launcher, not the school bus. Tracking is inextricably tied to object recognition. It is important to track the pickup truck headed straight toward the tactical operation center, not a similar truck headed into the farm field. So it is not only the identity of the object that matters, but also the activity in which the object is engaged.
This chapter is not the usual tale of tracking point-like targets. The question addressed is whether the automatic target tracker (ATT), the automatic target recognizer (ATR), and the activity recognizer (AR) should be treated as independent cooperating modules or should be fused together so tightly and so well that their distinctiveness is lost in the merger. The latter approach has historically been rare outside of biology and a few academic papers. Many open questions remain to be tackled in the years to come. Is it the low-level statistics that are important or the high-level semantics? Or, to put it another way: Does every picture tell a story? Is tracking the end goal, or is it an intermediate task leading to motor control, as in all biological systems? Should single-actor activity recognition be treated as no more than a natural temporal generalization of target recognition? Can complex multi-actor scenarios be discerned using queries to a track file database? Should the ATR and ATT be designed by independent groups, as is often the case, or are they best not considered separate entities in the broader system design? We will reflect on these issues.