This paper examines community efforts to enhance usability of motion imagery to support Geospatial Intelligence
(GEOINT) production and analysis. Beginning with a snapshot of "where we are now," the paper will describe potential
technologies for integration into baseline systems. The efforts cover PED chains for both wide area and narrow field of
view sensors, with strong emphasis on workflow and process automation. While automation is critical to
slowing the spiraling problem of data overload, it is recognized that not all intelligence problems are "machine
solvable" and therefore a delicate balance between human and machine must be designed and maintained in order to
meet critical timelines, achieve desired throughput, and get the most value out of the massive amounts of data collected.
The community continues to seek innovative ways to package and deliver enhanced analytic capability.
Over the last decade, intelligence capabilities within the Department of Defense/Intelligence Community (DoD/IC) have
evolved from ad hoc, single source, just-in-time, analog processing; to multi source, digitally integrated, real-time
analytics; to multi-INT, predictive Processing, Exploitation and Dissemination (PED). Full Motion Video (FMV)
technology and motion imagery tradecraft advancements have greatly contributed to Intelligence, Surveillance and
Reconnaissance (ISR) capabilities during this timeframe. Imagery analysts have exploited events, missions and high
value targets, generating and disseminating critical intelligence reports within seconds of occurrence across operationally
significant PED cells. Now, we go beyond FMV, enabling All-Source Analysts to effectively deliver ISR information in
a multi-INT sensor rich environment. In this paper, we explore the operational benefits and technical challenges of an
Activity Based Intelligence (ABI) approach to FMV PED. Existing and emerging ABI features within FMV PED
frameworks are discussed, to include refined motion imagery tools, additional intelligence sources, activity relevant
content management techniques and automated analytics.
In this contribution, we propose the use of eye tracking technology to support video analysts. To reduce workload, we
implemented two new interaction techniques as a substitute for mouse pointing: gaze-based selection of a video of
interest from a set of video streams, and gaze-based selection of moving targets in videos. First results show that the
multi-modal interaction technique gaze + key press allows the selection of fast moving objects in a more effective way.
Moreover, we discuss further application possibilities like gaze behavior analysis to measure the analyst's fatigue, or
analysis of the gaze behavior of expert analysts to instruct novices.
Georegistration is the assignment of geospatial coordinates to the pixels of an image. It is necessary for many activity-based
intelligence tasks. Motion imagery can be difficult to georegister due to wide sensor fields of view and parallax
from terrain elevations and buildings. We have developed a fully automated and accurate solution to the georegistration
problem that runs in real time on a PC. It works by generating and registering predicted images from digital elevation
models and fitting the parameters of the camera sensor model, including exterior and interior orientation. To estimate the
geospatial accuracy, the algorithm employs rigorous error propagation techniques from the field of photogrammetry. We
present results on a variety of aerial motion imagery, including full motion video and multi-camera wide area motion
imagery. We also present error propagation results and comparisons with ground truth.
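As an illustration of the first-order error propagation mentioned above, the following is a minimal sketch and not the authors' sensor model. It assumes a deliberately simplified geometry (a camera at height H above flat terrain looking off nadir by an angle theta, so the ground range is r = H tan(theta)) and assumes the height and angle errors are uncorrelated. The function names and the numeric values are hypothetical.

```python
import math

def ground_range(height_m, look_angle_rad):
    """Ground range from nadir for a flat-terrain, single-angle camera model (assumed)."""
    return height_m * math.tan(look_angle_rad)

def ground_range_sigma(height_m, look_angle_rad, sigma_height_m, sigma_angle_rad):
    """First-order propagation: sigma_r^2 = (dr/dH)^2 sigma_H^2 + (dr/dtheta)^2 sigma_theta^2."""
    d_dh = math.tan(look_angle_rad)                       # partial derivative wrt height
    d_dtheta = height_m / math.cos(look_angle_rad) ** 2   # partial derivative wrt look angle
    variance = (d_dh * sigma_height_m) ** 2 + (d_dtheta * sigma_angle_rad) ** 2
    return math.sqrt(variance)

# Hypothetical inputs: 1000 m height, 45 degree look angle,
# 5 m height uncertainty, 0.1 degree pointing uncertainty.
sigma = ground_range_sigma(1000.0, math.radians(45.0), 5.0, math.radians(0.1))
```

A rigorous photogrammetric treatment would propagate a full parameter covariance matrix through the Jacobian of the complete sensor model (exterior and interior orientation); the two-parameter case here only shows the shape of the calculation.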
Emerging standards for video metadata provide the means, in principle, for accurate geopositioning from full motion
video. Georegistration to reference data as part of the workflow adds value by improving the metadata accuracy,
establishing a check against mismodeling in the metadata and the corresponding a priori error covariance, and providing
a mechanism to recover usable geopositioning capability in the event of failure of the system generating or transmitting
the metadata. Georegistration may be done on board the collecting platform, at a ground station, or at any point in the
exploitation process. A system capable of full motion video georegistration to reference data will be described, which
establishes a photogrammetrically rigorous sensor model for each video frame. The sensor model operating parameters
and error covariance are updated based on matches between pairs of frames and between frames and reference data. The
challenge of finding associations between the reference data and the video images taken under very different imaging
conditions is met by using both direct and feature matching approaches. Methodology for the validation of
georegistration will be presented. Test results will be given for an operational real-time video georegistration system.
As defense and intelligence agencies seek to use the increasing amount of available data to make mission critical
decisions on the battlefield, there is heavy emphasis on smart data and imagery collection: the capture, storage, and
analysis necessary to drive real-time intelligence. This reality leads to an inevitable challenge: warfighters are
increasingly swimming in sensors and drowning in data. With the millions, if not billions, of sensors in place that
provide all-seeing reports of the combat environment, managing and tackling the overload is critical. This session
highlights the capabilities of file systems and storage technologies that can interactively manage 100M+ files and 1PB+
single directory file systems.
Cloud computing with storage virtualization and new service-oriented architectures brings a new perspective to the
aspect of a distributed motion imagery and persistent surveillance enterprise. Our existing research is focused mainly on
content management, distributed analytics, and WAN distributed cloud networking performance issues of cloud based
technologies. The potential of leveraging cloud based technologies for hosting motion imagery, imagery and analytics
workflows for DOD and security applications is relatively unexplored. This paper will examine technologies for
managing, storing, processing and disseminating motion imagery and imagery within a distributed network
environment. Finally, we propose areas for future research in distributed cloud content management.
Emerging cloud computing platforms offer an ideal opportunity for Intelligence, Surveillance, and Reconnaissance (ISR)
intelligence analysis. Cloud computing platforms help overcome challenges and limitations of traditional ISR
architectures. Modern ISR architectures can benefit from examining commercial cloud applications, especially as they
relate to user experience, usage profiling, and transformational business models. This paper outlines legacy ISR
architectures and their limitations, presents an overview of cloud technologies and their applications to the ISR
intelligence mission, and presents an idealized ISR architecture implemented with cloud computing.
Video has been a game-changer in how US forces are able to find, track and defeat their adversaries.
With millions of minutes of video being generated from an increasing number of sensor platforms,
the DOD has stated that the rapid increase in video is overwhelming their analysts. The manpower
required to view and garner usable information from the flood of video is unaffordable, especially
in light of current fiscal constraints. "Search" within full-motion video has traditionally relied on
human tagging of content, and video metadata, to provision filtering and locate segments of interest,
in the context of analyst query. Our approach uses novel machine-vision techniques to
index FMV through object recognition and tracking and event and activity detection. This approach
enables FMV exploitation in real-time, as well as a forensic look-back within archives. It
can help get the most information out of video sensor collection, help focus the attention of
overburdened analysts, form connections in activity over time, and conserve national fiscal resources
in exploiting FMV.
Airborne surveillance and reconnaissance are essential for many military missions. Such capabilities are critical for troop
protection, situational awareness, mission planning, and other tasks such as post-operation analysis and damage assessment.
Motion imagery gathered from both manned and unmanned platforms provides surveillance and reconnaissance
information that can be used for pre- and post-operation analysis, but these sensors can gather large amounts of video
data. It is extremely labour-intensive for operators to analyse hours of collected data without the aid of automated tools.
At MDA Systems Ltd. (MDA), we have previously developed a suite of automated video exploitation tools that can
process airborne video, including mosaicking, change detection and 3D reconstruction, within a GIS framework. The
mosaicking tool produces a geo-referenced 2D map from the sequence of video frames. The change detection tool
identifies differences between two repeat-pass videos taken of the same terrain. The 3D reconstruction tool creates
calibrated geo-referenced photo-realistic 3D models.
The key objectives of the on-going project are to improve the robustness, accuracy and speed of these tools, and make
them more user-friendly for operational users. Robustness and accuracy are essential to provide actionable intelligence,
surveillance and reconnaissance information. Speed is important to reduce operator time on data analysis. We are
porting some processor-intensive algorithms to run on a Graphics Processing Unit (GPU) in order to improve
throughput. Many aspects of video processing are highly parallel and well-suited for optimization on GPUs, which are
now commonly available on computers.
Moreover, we are extending the tools to handle video data from various airborne platforms and developing the interface
to the Coalition Shared Database (CSD). The CSD server enables the dissemination and storage of data from different
sensors among NATO countries. The CSD interface allows operational users to search and retrieve relevant video data.
This paper presents an approach to detect visually salient patches in order to use them for visual place recognition.
We formulate the saliency detection problem as an optimization problem, and define an energy function which
describes the distinctiveness of a given image patch. We employ a Branch & Bound based search technique to
efficiently find the global optimum of the energy function. Moreover, we use integral images to further increase
the efficiency of the approach. The proposed saliency detection technique is able to detect patches which are
suitable to be used as visual landmarks, and it performs with very high efficiency.
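The integral-image step mentioned above can be sketched as follows. This is a minimal illustration of the general summed-area-table technique, not the authors' implementation: the abstract does not give the energy function, so the plain pixel sum used here only stands in for whatever per-pixel quantity that function aggregates. The function names and the 3x3 example image are hypothetical.

```python
def integral_image(img):
    """Build a summed-area table with an extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    table = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            table[y + 1][x + 1] = table[y][x + 1] + row_sum
    return table

def box_sum(table, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 using four lookups."""
    return (table[bottom][right] - table[top][right]
            - table[bottom][left] + table[top][left])

image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
patch_total = box_sum(table, 1, 1, 3, 3)  # lower-right 2x2 patch: 5 + 6 + 8 + 9
```

Once the table is built, any rectangular patch can be scored in constant time, which is what makes evaluating many candidate patches during a Branch & Bound search inexpensive.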
When viewing full motion video (FMV) from an unmanned aerial vehicle, the "context" of the video (the location
and orientation of objects within the video) is often as important to the end-user as the video itself. To provide
context to video being collected in real-time, we have developed a system for placing frames from a FMV stream
in a geographic context. As a visualization platform, we utilize Pursuer, a US Air Force "government-off-the-shelf"
system based on NASA's World Wind software package. Pursuer provides an intuitive interface for viewing
several different layers of imagery, including pre-existing maps, reference imagery, and recently collected imagery,
all placed within geographical context (similar to Google Earth). The focus of this paper is the technology
developed for creating a Pursuer layer for FMV streams. We present results obtained from small UAV
flights in Florida and New York and discuss needed future improvements.
We present a method for segmenting FMV streams to dynamically extract scene recognition and change detection
information using simple on-the-fly statistics. We show how the video scene can be segmented, enabling sub-frame
statistical characterization. The features are written into dynamic look-up tables (LUTs) in real-time. Behavior
recognition occurs by testing if the newly observed scene statistics have already been recorded in the table. The features
in the LUT can later be used to derive predictive behavior data models. We demonstrate results of our approach on
various types of FMV and micro UAV video data streams.
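The look-up-table idea described above can be sketched roughly as follows. The abstract does not say which statistics are recorded or how the frame is segmented, so everything specific here is an assumption made for illustration: a fixed grid of cells, quantized mean and variance as the per-cell feature, and a Python dict as the LUT. All names and parameter values are hypothetical.

```python
from statistics import mean, pvariance

def cell_signature(pixels, mean_step=16, var_step=64):
    """Quantize a cell's mean and variance into a small hashable feature (assumed statistics)."""
    return (int(mean(pixels) // mean_step), int(pvariance(pixels) // var_step))

def split_into_cells(frame, cell):
    """Yield ((row, col), pixels) for each cell-by-cell block of a 2-D frame."""
    for top in range(0, len(frame), cell):
        for left in range(0, len(frame[0]), cell):
            block = [v for row in frame[top:top + cell] for v in row[left:left + cell]]
            yield (top // cell, left // cell), block

def update_lut(lut, frame, cell=2):
    """Record every cell signature in the LUT and return the cells whose signature is new."""
    novel = []
    for pos, pixels in split_into_cells(frame, cell):
        key = (pos, cell_signature(pixels))
        if key not in lut:
            novel.append(pos)
        lut[key] = lut.get(key, 0) + 1  # count how often this statistic was observed
    return novel

lut = {}
background = [[10, 12, 200, 202],
              [11, 13, 201, 203]]
changed = [[10, 12, 90, 202],
           [11, 13, 201, 40]]
first = update_lut(lut, background)     # every cell is new on the first frame
repeat = update_lut(lut, background)    # nothing new when the scene repeats
after_change = update_lut(lut, changed) # only the altered cell is reported
```

Because each test is a single dictionary lookup per cell, the check keeps pace with the frame rate; the observation counts stored in the table are the kind of record a later behavior model could be fitted to.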
Improvements in sensor technology, such as charge-coupled devices (CCDs), together with steady incremental improvements
in storage capacity, have made the recording and storage of video more prevalent and less costly than ever before.
However, the improvements in the ability to capture and store a wide array of video have required additional manpower
to translate these raw data sources into useful information. We propose an algorithm for automatically detecting
anomalous movement patterns within full motion video, thus reducing the amount of human intervention required to
make use of these new data sources. The proposed algorithm tracks all of the objects within a video sequence and
attempts to cluster each object's trajectory into a database of existing trajectories. Objects are first
differentiated from a Gaussian background model and then tracked over subsequent frames based on a
combination of size and color. Once an object is tracked over several frames, its trajectory is calculated and compared
with other trajectories earlier in the video sequence. Anomalous trajectories are differentiated by their failure to cluster
with other well-known movement patterns. Adding the proposed algorithm to an existing surveillance system could
increase the likelihood of identifying an anomaly and allow for more efficient collection of intelligence data.
Additionally, by operating in real-time, our algorithm allows for the reallocation of sensing equipment to those areas
most likely to contain movement that is valuable for situational awareness.
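The trajectory-clustering step described above can be illustrated with a minimal sketch. It covers only the clustering and anomaly decision, not the Gaussian background model or the size-and-color tracker, and it is not the authors' algorithm: the resampling scheme, the mean point-wise distance, the greedy seed-based clustering, and the `radius` and `min_members` thresholds are all assumptions chosen for brevity.

```python
import math

def resample(track, n=8):
    """Linearly resample a track (list of (x, y) points) to n points by index."""
    out = []
    for i in range(n):
        t = i * (len(track) - 1) / (n - 1)
        lo = int(math.floor(t))
        hi = min(lo + 1, len(track) - 1)
        f = t - lo
        out.append((track[lo][0] + f * (track[hi][0] - track[lo][0]),
                    track[lo][1] + f * (track[hi][1] - track[lo][1])))
    return out

def mean_distance(a, b):
    """Mean point-wise Euclidean distance between two resampled tracks."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def find_anomalies(tracks, radius=5.0, min_members=2):
    """Greedy clustering: a track joins the first cluster whose seed is within radius.
    Tracks left in clusters with fewer than min_members are reported as anomalous."""
    clusters = []  # list of (seed_shape, member_indices)
    for idx, track in enumerate(tracks):
        shape = resample(track)
        for seed, members in clusters:
            if mean_distance(seed, shape) <= radius:
                members.append(idx)
                break
        else:
            clusters.append((shape, [idx]))
    return sorted(i for _, members in clusters if len(members) < min_members for i in members)

tracks = [
    [(0, 0), (10, 0), (20, 0)],                   # eastbound traffic
    [(0, 1), (10, 1), (20, 1)],                   # same pattern, adjacent lane
    [(0, 2), (5, 2), (10, 2), (15, 2), (20, 2)],  # same pattern, denser sampling
    [(0, 0), (10, 20), (20, 40)],                 # cuts diagonally across the scene
]
anomalies = find_anomalies(tracks)
```

In this batch form every track is compared against the cluster seeds seen so far; a streaming variant would need to defer the anomaly decision until enough of the video has been observed for common patterns to accumulate members.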
The last decade has seen the emergence of Wide Area ISR systems such as Gorgon Stare and ARGUS. Wide Area ISR
sensor systems have many times the pixel count of a high definition Full Motion Video (FMV) sensor. Besides the effect
of data overload, the scale of Wide Area systems has exposed a need for scale sensitive standards. Presented here is a
survey of the current state of Wide Area system standards in the areas of data / file format, archive query interface,
streaming, and live sensor control. Areas of standardization success and areas for further improvement are identified.
In many situations, the difference between success and failure comes down to taking the right actions quickly. While the
myriad electronic sensors available today can provide data quickly, that data may overload the operator; only a
contextualized, centralized display of information and an intuitive human interface can support the quick and
effective decisions needed. If these decisions are to result in quick actions, then the operator must be able to understand
all of the data about the environment. In this paper we present a novel approach to contextualizing multi-sensor data onto a
full motion video real-time 360 degree imaging display. The system described could function as a primary display
system for command and control in security, military and observation posts. It has the ability to process and enable
interactive control of multiple other sensor systems. It enhances the value of these other sensors by overlaying their
information on a panorama of the surroundings. It can also be used to interface with other systems, including auxiliary
electro-optical systems, aerial video, contact management, Hostile Fire Indicators (HFI), and Remote Weapon Stations.
Automated motion image-based tracking is an increasingly important tool in Intelligence, Surveillance, and
Reconnaissance (ISR). Unfortunately, current tracking technology is not up to the performance levels needed to
deliver key subtasks in this arena. We postulate that the under-performance of automated trackers derives from the
under-exploitation of the rich sets of features related to the identification of items being tracked. To address this, we
previously proposed a probabilistic formulation of features that supports easy exchange and integration of new
features. This paper provides a deeper specification of the formulation. In particular, we employ the Open
Geospatial Consortium (OGC) Observations and Measurements (O&M) standard for the new specification. We use
OGC O&M to describe non-parametric distributions of image features with respect to the entity resolution problem
within feature-based tracking. An example is presented. We believe this approach will provide the foundations for a
far wider and more effective exploration of potential features related to tracking and, as a result, lead to
significantly better and more sustainable growth in tracker performance.
Automated moving object detection and tracking are increasingly viewed as solutions to the enormous data volumes
resulting from emerging wide-area persistent surveillance systems. In a previous paper we described a Motion
Imagery Standards Board (MISB) initiative to help address this problem: the specification of a micro-architecture for
the automatic extraction of motion indicators and tracks. This paper reports on the development of an extended
specification of the plug-and-play tracking micro-architecture, and on its status as an emerging standard across DoD, the
Intelligence Community, and NATO.
Covert and overt video collection systems as well as tactical unmanned aerial vehicles (UAVs) and
unmanned ground vehicles (UGVs) can deliver real-time video intelligence direct from sensor
systems to command staff, providing unprecedented situational awareness and tactical advantage.
Today's tactical video communications system must be secure, compact, lightweight, and fieldable
in quick reaction scenarios. Four main technology implementations can be identified with the
evolutionary development of wireless video transmission systems. Analog FM led to single carrier
digital modulation, which gave way to multi-carrier orthogonal modulation. Each of these systems
is still in use today. Depending on the operating environment and size, weight, and power
limitations, a system designer may choose one over another to support tactical video collection.
We introduce Salience-Based Compression (SBC), a vision-guided pre-filtering technology, coupled with standards-based
video coding. SBC works by detecting and tracking salient features and keeping them sharp; non-salient features
are lowpass filtered, causing an automatic and beneficial drop in bit rate. Because salience-based pre-filtering is
performed as a pre-processing step, it can interface to any COTS video encoder, thus enabling use in existing
infrastructures and ensuring the compliance of the video bitstream that is produced. For typical aerial surveillance video,
SBC can reduce bit rate by up to a factor of four, yet still provide full motion video (FMV) and preserve salient visual