KEYWORDS: Video, Visual process modeling, Detection and tracking algorithms, Visualization, Visual system, Information visualization, Zoom lenses, Error analysis, Human vision and color perception, Neurons
Multiple Object Tracking (MOT) experiments show that human observers can track up to five moving targets among
several moving distractors for several seconds. We extended these studies by designing modified MOT experiments
to investigate the spatio-temporal characteristics of human visuo-cognitive mechanisms for tracking, and we applied the
findings and insights from these experiments to the design of computational multiple object tracking algorithms.
Recent studies indicate that attention both enhances the neural activity of relevant information and suppresses the
irrelevant visual information in the surround. Results of our experiments suggest that the suppressive surround of
attention extends up to 4 deg from the target stimulus and takes at least 100 ms to build. We suggest that when the
attentional windows corresponding to separate target regions are spatially close, they can be grouped to form a single
attentional window to avoid interference originating from suppressive surrounds. The grouping experiment results
indicate that the attentional windows are grouped into a single one when the distance between them is less than 1.5 deg.
A preliminary implementation of the suppressive surround concept in our computational video object tracker resulted in
fewer unnecessary object merges in computational video tracking experiments.
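The grouping rule described above can be illustrated with a minimal sketch. It is not the authors' tracker: the function name `group_windows`, the use of window centers in degrees of visual angle, and the transitive (single-linkage) merging are all assumptions made for illustration. Only the 1.5 deg threshold comes from the abstract.

```python
from itertools import combinations
from math import hypot

# Threshold reported in the abstract: windows closer than 1.5 deg are grouped.
GROUPING_DISTANCE_DEG = 1.5


def group_windows(centers, threshold=GROUPING_DISTANCE_DEG):
    """Group attentional windows whose centers are closer than `threshold`.

    `centers` is a list of (x, y) positions in degrees of visual angle.
    Merging is transitive (an assumption, not stated in the source):
    if A is near B and B is near C, all three share one window.
    Returns a sorted list of groups, each a list of input indices.
    """
    parent = list(range(len(centers)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(centers)), 2):
        (xi, yi), (xj, yj) = centers[i], centers[j]
        if hypot(xi - xj, yi - yj) < threshold:
            parent[find(j)] = find(i)

    groups = {}
    for i in range(len(centers)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    # Hypothetical target positions in degrees of visual angle.
    targets = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (2.0, 0.0)]
    print(group_windows(targets))
```

In this sketch, targets 0, 1, and 3 end up in one window because each is within 1.5 deg of a neighbor in the chain, while target 2 keeps its own window. A tracker that merged only directly adjacent pairs would behave differently; the abstract does not say which variant was used.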
Development and maintenance of unsupervised intelligent activity rely on active interaction with the environment. Such active exploratory behavior plays an essential role in both the developmental and adult phases of higher biological systems, including humans. Exploration initiates a self-organization process whereby a coherent fusion of different sensory and motor modalities can be achieved (sensory-motor development) and maintained (adult rearrangement). In addition, the development of intelligence depends critically on active manipulation of the environment. These observations are in sharp contrast with current approaches in artificial intelligence and various neural network models. In this paper, we present a neural network model that combines internal drives and environmental cues to reach behavioral decisions for exploratory activity. The vision system consists of an ambient and a focal system. The ambient vision system guides eye movements by using nonassociative learning. This sensory-based attentional focusing is augmented by a 'cognitive' system using models developed for various aspects of frontal lobe function. The combined system has nonassociative learning, reinforcement learning, selective attention, habit formation, and flexible criterion categorization properties.
While temporal properties of the visual system have been the subject of extensive research in psychology, many computational theories are based on steady-state behavior. For example, Marr's theory requires early measurements to be instantaneous. Furthermore, optimization-type approaches to perception are designed around properties of equilibria, and very little attention is devoted to the relevance of trajectories to perceptual experience. Electrophysiological findings, however, show that visual neurons such as retinal ganglion cells possess strong transient components. Therefore, a fundamental issue in the perceptual sciences is understanding the relevance of these transient components to visual perception. This study claims that adaptive, nonmonotonic transient properties of early visual units are crucial components of visual processing. An extra-retinal feedback on-center off-surround anatomy is proposed to sharpen the 'blurred output' from the retinal level. Based on theoretical studies of the pattern transformation properties of recurrent networks for sustained inputs, we propose a global model (including the retina and extra-retinal areas) of visual processing in which a reset from transient ganglion cells of the retina prevents smearing for moving images. The model provides a theoretical link between hyperacuity (achieved by denser extra-retinal packing and nonlinear contrast enhancement) and visual masking (resulting from inter-layer and intra-channel inhibition mechanisms).