Visual tracking is a basic skill needed for active vision, traffic surveillance, and many robotic applications. This paper studies the inherent dynamics of visual tracking. The main result is that dynamic tracking performance reaches an optimum when the time to process the image equals the time needed to sample the image. This holds for all systems that use conventional CCD cameras and area windows, as practically all present approaches do. The optimum applies to tracking within a camera image as well as to the active vision approach of steering the camera, and it is valid for square, circular, and multiple windows. It is further shown that, with present-technology cameras, reducing the sampling time can improve tracking only to a certain degree.