A distributed camera network allows for many compelling applications such as large-scale tracking or event
detection. In most practical systems, resources are constrained. Although one would like to probe every camera
at every time instant and store every frame, this is simply not feasible. Constraints arise from network bandwidth
restrictions, I/O and disk usage from writing images, and CPU usage needed to extract features from the images.
Assume that, due to resource constraints, only a subset of sensors can be probed at any given time unit.
This paper examines the problem of selecting the "best" subset of sensors to probe under some user-specified
objective, e.g., detecting as much motion as possible. Under this objective, we would like to probe a camera
when we expect motion, but not waste resources on an inactive camera. The main idea behind our
approach is the use of sensor semantics to guide the scheduling of resources. We learn a dynamic probabilistic
model of motion correlations between cameras, and use the model to guide resource allocation for our sensor
network. Although previous work has leveraged probabilistic models for sensor scheduling, our work is distinct in its
focus on real-time building-monitoring using a camera network. We validate our approach on a sensor network of
a dozen cameras spread throughout a university building, recording measurements of unscripted human activity
over a two-week period. We automatically learn a semantic model of typical behaviors, and show that one can
significantly improve the efficiency of resource allocation by exploiting this model.
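As a concrete sketch of the scheduling idea, the code below greedily probes the k cameras with the highest predicted motion probability, then propagates each observation through learned pairwise correlation weights. The function names, the decay factor, and the additive update rule are illustrative assumptions, not the paper's actual dynamic model.

```python
import heapq


def select_cameras(motion_prob, k):
    """Greedily choose the k cameras with the highest predicted
    probability of motion for the next time step."""
    return heapq.nlargest(k, motion_prob, key=motion_prob.get)


def update_predictions(motion_prob, observed, correlation, decay=0.5):
    """Decay all predictions, then propagate observed motion to
    correlated cameras: if camera i showed motion, raise the predicted
    probability of each neighbor j by the learned weight corr[i][j]."""
    new_prob = {c: decay * p for c, p in motion_prob.items()}
    for i, active in observed.items():
        if not active:
            continue
        for j, weight in correlation.get(i, {}).items():
            new_prob[j] = min(1.0, new_prob[j] + weight)
    return new_prob
```

For example, if motion is observed at a hallway camera that strongly correlates with an adjacent doorway camera, the doorway camera's predicted probability rises and it wins the next probing slot.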
Event detection from a video stream is becoming an important and challenging task in surveillance and sentient
systems. Although computer vision has been studied extensively for many kinds of detection problems, event
detection remains hard: even in a controlled environment, only simple events can be detected with a
high degree of accuracy. Rather than struggling to improve event detection using image processing alone, we bring
in semantics to direct traditional image processing. Semantics are the underlying facts that lie beneath video
frames and cannot be "seen" directly by image processing. In this work we demonstrate that time sequence
semantics can be exploited to guide unsupervised re-calibration of the event detection system. We present an
instantiation of our ideas using an appliance as an example (Coffee Pot level detection based on video data) to show that semantics can guide the re-calibration of the detection model.
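One simple way to instantiate this idea, sketched here under an assumed semantic rule rather than the system's actual one, is to encode the time-sequence constraint that a coffee pot's level should not rise between brews, and to trigger re-calibration when the detector's recent outputs violate that constraint too often.

```python
def violation_rate(levels, tolerance=0.05):
    """Time-sequence semantic check (hypothetical rule): between brews,
    the measured coffee level should never increase. Return the fraction
    of consecutive frames where the estimate rises by more than
    `tolerance`, which suggests the detection model has drifted."""
    violations = sum(
        1 for prev, cur in zip(levels, levels[1:]) if cur - prev > tolerance
    )
    return violations / max(len(levels) - 1, 1)


def needs_recalibration(levels, max_violation_rate=0.2):
    """Flag re-calibration when the violation rate over a window of
    recent level estimates exceeds a threshold."""
    return violation_rate(levels) > max_violation_rate
```

A steadily decreasing sequence of level estimates passes the check, while a sequence that repeatedly jumps upward (physically implausible between brews) flags the detector for unsupervised re-learning.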
This work exploits time sequence semantics to detect when re-calibration is required, to automatically relearn
a new detection model for the newly evolved system state, and to resume monitoring with a higher rate of