Aerial video surveillance has advanced significantly in recent years, as inexpensive high-quality video cameras and airborne platforms have become more readily available. Video has become an indispensable part of military operations and is now increasingly valuable in the civil and paramilitary sectors. Such surveillance capabilities are useful for battlefield intelligence and reconnaissance as well as for monitoring major events, border control and critical infrastructure. However, monitoring this growing flood of video data requires significant effort from increasingly large numbers of video analysts. We have developed a suite of aerial video exploitation tools that can relieve analysts of mundane monitoring by detecting, and alerting them to, objects and activities that require their attention. These tools can be used for both tactical applications and post-mission analytics, so that the video data can be exploited more efficiently and in a more timely manner. A feature-based approach and a pixel-based approach have been developed for Video Moving Target Indicator (VMTI) to detect moving objects in aerial video in real time. Such moving objects can then be classified by a person detection algorithm trained on representative aerial data. We have also developed an activity detection tool that can detect activities of interest in aerial video, such as person-vehicle interaction. We have implemented a flexible framework so that new processing modules can be added easily. The Graphical User Interface (GUI) allows the user to configure the processing pipeline at run-time to evaluate different algorithms and parameters. Promising experimental results have been obtained using these tools and an evaluation has been carried out to characterize their performance.
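As a rough illustration of the pixel-based VMTI idea, the sketch below flags pixels that change between two stabilized greyscale frames and reports a bounding box around the detection. The function names and threshold are invented for illustration and are not part of the actual tools.

```python
import numpy as np

def moving_target_mask(prev_frame, curr_frame, thresh=25):
    # Flag pixels whose greyscale intensity changed by more than `thresh`
    # between two stabilized frames (illustrative threshold).
    diff = np.abs(curr_frame.astype(np.int16) - prev_frame.astype(np.int16))
    return diff > thresh

def target_box(mask):
    # Bounding box (x0, y0, x1, y1) around all changed pixels, or None.
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

A real pipeline would first stabilize the frames to remove platform motion; this sketch assumes that step has already been done.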
A key component in the emerging localization and mapping paradigm is an appearance-based place recognition
algorithm that detects when a place has been revisited. This algorithm can run in the background at a low
frame rate and be used to signal a global geometric mapping algorithm when a loop is detected. An optimization
technique can then be used to correct the map by 'closing the loop'. This allows an autonomous unmanned ground
vehicle to improve localization and map accuracy and successfully navigate large environments. Image-based
place recognition techniques lack robustness to sensor orientation and varying lighting conditions. Additionally,
the quality of range estimates from monocular or stereo imagery can decrease the loop closure accuracy. Here,
we present a lidar-based place recognition system that is robust to these challenges. This probabilistic framework
learns a generative model of place appearance and determines whether a new observation comes from a new or
previously seen place. Highly descriptive features called the Variable Dimensional Local Shape Descriptors are
extracted from lidar range data to encode environment features. The range data processing has been implemented
on a graphics processing unit to optimize performance. The system runs in real-time on a military research
vehicle equipped with a highly accurate, 360 degree field of view lidar and can detect loops regardless of the
sensor orientation. Promising experimental results are presented for both rural and urban scenes in large outdoor
Airborne surveillance and reconnaissance are essential for many military missions. Such capabilities are critical for troop protection, situational awareness, mission planning, and other tasks such as post-operation analysis and damage assessment.
Motion imagery gathered from both manned and unmanned platforms provides surveillance and reconnaissance
information that can be used for pre- and post-operation analysis, but these sensors can gather large amounts of video
data. It is extremely labour-intensive for operators to analyse hours of collected data without the aid of automated tools.
At MDA Systems Ltd. (MDA), we have previously developed a suite of automated video exploitation tools that can
process airborne video, including mosaicking, change detection and 3D reconstruction, within a GIS framework. The
mosaicking tool produces a geo-referenced 2D map from the sequence of video frames. The change detection tool
identifies differences between two repeat-pass videos taken of the same terrain. The 3D reconstruction tool creates
calibrated geo-referenced photo-realistic 3D models.
The key objectives of the on-going project are to improve the robustness, accuracy and speed of these tools, and make
them more user-friendly to operational users. Robustness and accuracy are essential to provide actionable intelligence,
surveillance and reconnaissance information. Speed is important to reduce operator time on data analysis. We are
porting some processor-intensive algorithms to run on a Graphics Processing Unit (GPU) in order to improve
throughput. Many aspects of video processing are highly parallel and well-suited for optimization on GPUs, which are
now commonly available on computers.
Moreover, we are extending the tools to handle video data from various airborne platforms and developing the interface
to the Coalition Shared Database (CSD). The CSD server enables the dissemination and storage of data from different
sensors among NATO countries. The CSD interface allows operational users to search and retrieve relevant video data.
Airborne surveillance and reconnaissance are essential for successful military missions. Such capabilities are critical for
troop protection, situational awareness, mission planning, damage assessment, and others. Unmanned Aerial Vehicles
(UAVs) gather huge amounts of video data, but it is extremely labour-intensive for operators to analyze hours and hours
of received data.
At MDA, we have developed a suite of tools that can process the UAV video data automatically, including mosaicking,
change detection and 3D reconstruction, which have been integrated within a standard GIS framework. In addition, the
mosaicking and 3D reconstruction tools have also been integrated in a Service Oriented Architecture (SOA) framework.
The Visualization and Exploitation Workstation (VIEW) integrates 2D and 3D visualization, processing, and analysis
capabilities developed for UAV video exploitation. Visualization capabilities are supported through a thick-client
Graphical User Interface (GUI), which allows visualization of 2D imagery, video, and 3D models. The GUI interacts
with the VIEW server, which provides video mosaicking and 3D reconstruction exploitation services through the SOA framework.
The SOA framework allows multiple users to perform video exploitation by running a GUI client on the operator's
computer and invoking the video exploitation functionalities residing on the server. This allows the exploitation services
to be upgraded easily and allows the intensive video processing to run on powerful workstations.
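A minimal sketch of the client-server pattern described above, assuming a simple registry of named exploitation services; in a real SOA deployment this dispatch would sit behind a network API, and the service names and payloads here are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ExploitationServer:
    # Registry of named exploitation services (e.g. "mosaic");
    # clients submit jobs by name and receive the service's result.
    services: dict = field(default_factory=dict)

    def register(self, name, fn):
        self.services[name] = fn

    def submit(self, name, payload):
        if name not in self.services:
            raise KeyError(f"unknown service: {name}")
        return self.services[name](payload)
```

Because clients only know service names, the server-side implementations can be upgraded or moved to more powerful workstations without changing the GUI clients.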
MDA provides UAV services to the Canadian and Australian forces in Afghanistan with the Heron, a Medium Altitude
Long Endurance (MALE) UAV system. The on-going flight operations service provides important intelligence, surveillance, and reconnaissance information to commanders and front-line soldiers.
3D imagery has a well-known potential for improving situational awareness and battlespace visualization by
providing enhanced knowledge of uncooperative targets. This potential arises from the numerous advantages that 3D imagery offers over traditional 2D imagery, which can increase the accuracy of automatic target detection (ATD) and recognition (ATR). Despite advancements in both 3D sensing and 3D data exploitation, 3D imagery has yet to demonstrate a true operational gain, partly due to the processing burden of the massive data loads generated by modern sensors. In this context, this paper describes the current status of a workbench
designed for the study of 3D ATD/ATR. Among the project goals is the comparative assessment of algorithms
and 3D sensing technologies given various scenarios. The workbench is comprised of three components: a
database, a toolbox, and a simulation environment. The database stores, manages, and edits input data of
various types such as point clouds, video, still imagery frames, CAD models and metadata. The toolbox features
data processing modules, including range data manipulation, surface mesh generation, texture mapping, and
a shape-from-motion module to extract a 3D target representation from video frames or from a sequence of
still imagery. The simulation environment includes synthetic point cloud generation, a 3D ATD/ATR algorithm prototyping environment, and performance metrics for comparative assessment. In this paper, the workbench
components are described and preliminary results are presented. Ladar, video and still imagery datasets collected
during airborne trials are also detailed.
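As an example of what synthetic point cloud generation might involve, the toy sketch below samples lidar-like points on the faces of a box-shaped target and adds Gaussian sensor noise. The function and its parameters are illustrative only, not the workbench's actual generator.

```python
import numpy as np

def synthetic_box_cloud(n_per_face=100, size=(2.0, 1.0, 1.0),
                        noise_sigma=0.01, seed=0):
    # Sample lidar-like points on the six faces of an axis-aligned box
    # and add Gaussian sensor noise (sigma in the same units as `size`).
    rng = np.random.default_rng(seed)
    pts = []
    for axis in range(3):
        for side in (0.0, size[axis]):
            p = rng.random((n_per_face, 3)) * np.asarray(size)
            p[:, axis] = side          # clamp one coordinate to the face
            pts.append(p)
    cloud = np.vstack(pts)
    return cloud + rng.normal(0.0, noise_sigma, cloud.shape)
```

A real simulator would also model occlusion, beam divergence, and the sensor's scan pattern; this sketch captures only the geometry-plus-noise idea.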
Airborne surveillance and reconnaissance are essential for successful military missions. Such capabilities are critical for
force protection, situational awareness, mission planning, damage assessment and others. UAVs gather huge amounts of video data, but it is extremely labour-intensive for operators to analyse hours and hours of received data.
At MDA, we have developed a suite of tools towards automated video exploitation including calibration, visualization,
change detection and 3D reconstruction. The on-going work is to improve the robustness of these tools and automate the
process as much as possible. Our calibration tool extracts and matches tie-points in the video frames incrementally to
recover the camera calibration and poses, which are then refined by bundle adjustment. Our visualization tool stabilizes
the video, expands its field-of-view and creates a geo-referenced mosaic from the video frames.
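One common way to place stabilized frames into a mosaic, sketched here under the assumption that pairwise frame-to-frame homographies have already been estimated from matched tie-points, is to compose them into a common reference frame and project each frame's corners to find its footprint. The code is a generic sketch, not the tool's implementation.

```python
import numpy as np

def chain_homographies(pairwise):
    # Compose frame-to-frame homographies into frame-to-reference ones:
    # pairwise[i] maps frame i+1 into frame i; output[j] maps frame j
    # into frame 0 (the mosaic reference).
    H = np.eye(3)
    chained = [H.copy()]
    for Hp in pairwise:
        H = H @ Hp
        chained.append(H.copy())
    return chained

def warp_corners(H, w, h):
    # Project the four frame corners into mosaic coordinates to find
    # the frame's footprint (homogeneous divide at the end).
    c = np.array([[0, 0, 1], [w, 0, 1], [w, h, 1], [0, h, 1]], float).T
    p = H @ c
    return (p[:2] / p[2]).T
```

The union of all frame footprints gives the mosaic extent; bundle adjustment, as mentioned above, would refine the chained transforms to limit drift.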
It is important to identify anomalies in a scene, such as improvised explosive devices (IEDs). However, manually comparing video clips to look for differences is tedious and difficult. Our change detection tool
allows the user to load two video clips taken from two passes at different times and flags any changes between them.
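A toy version of repeat-pass change detection can be sketched as phase correlation for coarse alignment followed by thresholded differencing. Real repeat-pass registration must handle perspective and terrain effects that this integer-shift model ignores, so treat this only as an illustration of the align-then-difference idea.

```python
import numpy as np

def phase_correlate(a, b):
    # Estimate the integer (dy, dx) shift that aligns frame b to frame a.
    F = np.fft.fft2(a) * np.conj(np.fft.fft2(b))
    F /= np.abs(F) + 1e-12
    corr = np.fft.ifft2(F).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    if dy > a.shape[0] // 2:        # unwrap circular shifts
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return int(dy), int(dx)

def change_mask(a, b, thresh=30):
    # Align b to a with the estimated shift, then threshold the difference.
    dy, dx = phase_correlate(a, b)
    b_aligned = np.roll(b, (dy, dx), axis=(0, 1))
    return np.abs(a.astype(int) - b_aligned.astype(int)) > thresh
```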
3D models are useful for situational awareness, as it is easier to understand the scene by visualizing it in 3D. Our 3D
reconstruction tool creates calibrated photo-realistic 3D models from video clips taken from different viewpoints, using
both semi-automated and automated approaches. The resulting 3D models also allow distance measurements and line-of-sight analysis.
Servicing satellites in space requires accurate and reliable 3D information. Such information can be used for creating virtual models of space structures for inspection (geometry, surface flaws, and deployment of appendages), estimating the relative position and orientation of a target spacecraft during autonomous docking or satellite capture, replacing serviceable modules, and detecting unexpected objects and collisions. Existing space vision systems rely on assumptions to achieve the necessary performance and reliability. Future missions will require vision systems that can operate without visual targets and under less restrictive operational conditions, moving towards full autonomy.
Our vision system uses stereo cameras with a pattern projector and software to obtain reliable and accurate 3D information. It can process images from cameras mounted on a robotic arm end-effector on a space structure or a spacecraft. Image sequences can be acquired during relative camera motion, during fly-around of a spacecraft or motion of the arm. The system recovers the relative camera motion from the image sequence automatically without using spacecraft or arm telemetry. The 3D data computed can then be integrated to generate a calibrated photo-realistic 3D model of the space structure.
Feature-based and shape-based approaches for camera motion estimation have been developed and compared. Space materials and illumination introduce imaging effects on specular surfaces. With a pattern projector and redundant stereo cameras, the robustness and accuracy of stereo matching are improved, as inconsistent 3D points are discarded. Experiments in our space vision facility show promising results, and photo-realistic 3D models of scaled satellite replicas have been created.
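The consistency check between redundant stereo pairs might look like the following sketch: reconstruct the same pixels from two camera pairs, drop points where the two reconstructions disagree, and average the rest. The tolerance value and function name are illustrative, not the system's actual parameters.

```python
import numpy as np

def consistent_points(points_a, points_b, tol=0.005):
    # points_a, points_b: (N, 3) reconstructions of the same pixels from
    # two redundant stereo pairs. Points whose two reconstructions differ
    # by more than `tol` (illustrative units) are discarded as unreliable;
    # the rest are averaged.
    d = np.linalg.norm(points_a - points_b, axis=1)
    keep = d <= tol
    return 0.5 * (points_a[keep] + points_b[keep]), keep
```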
Instant Scene Modeler (iSM) is a vision system for generating calibrated photo-realistic 3D models of unknown
environments quickly using stereo image sequences. Equipped with iSM, Unmanned Ground Vehicles (UGVs) can
capture stereo images and create 3D models to be sent back to the base station, while they explore unknown
environments. Rapid access to 3D models will increase the operator's situational awareness and allow better mission
planning and execution, as the models can be visualized from different views and used for relative measurements.
In current military operations involving UGVs under urban warfare threats, an operator hand-sketches the environment from the live video feed. iSM eliminates the need for this additional operator, as the 3D model is generated automatically. The
photo-realism of the models enhances the situational awareness of the mission and the models can also be used for
change detection. iSM has been tested on our autonomous vehicle to create photo-realistic 3D models while the rover
traverses in unknown environments.
Moreover, a proof-of-concept iSM payload has been mounted on an iRobot PackBot with Wayfarer technology, which is
equipped with autonomous urban reconnaissance capabilities. The Wayfarer PackBot UGV uses wheel odometry for
localization and builds 2D occupancy grid maps from a laser sensor. While the UGV is following walls and avoiding
obstacles, iSM captures and processes images to create photo-realistic 3D models. Experimental results show that iSM
can complement Wayfarer PackBot's autonomous navigation in two ways. The photo-realistic 3D models provide better
situational awareness than 2D grid maps. Moreover, iSM recovers the camera motion, also known as visual odometry. As wheel odometry error grows over time, the visual odometry can help correct it for better localization.
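A crude illustration of how visual odometry can compensate wheel odometry drift: blend the per-step pose increments, weighting the visual estimate more heavily, then integrate a 2D track. The blending weight is invented for illustration; a fielded system would use a proper filter rather than a fixed weight.

```python
import numpy as np

def fuse_odometry(wheel_deltas, visual_deltas, w_visual=0.8):
    # Blend per-step (dx, dy, dtheta) increments from wheel and visual
    # odometry (weight is illustrative), then integrate into a 2D pose
    # track expressed in the starting frame.
    x = y = th = 0.0
    track = [(x, y, th)]
    for wd, vd in zip(wheel_deltas, visual_deltas):
        dx, dy, dth = ((1.0 - w_visual) * np.asarray(wd)
                       + w_visual * np.asarray(vd))
        x += dx * np.cos(th) - dy * np.sin(th)
        y += dx * np.sin(th) + dy * np.cos(th)
        th += dth
        track.append((x, y, th))
    return track
```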
Servicing satellites on-orbit requires the ability of an unmanned spacecraft to rendezvous and dock with no or minimal human input. Novel imaging sensors and computer vision technologies are required to detect a target spacecraft at a distance of several kilometers and to guide the approaching spacecraft to contact. Current optical systems operate at much shorter distances, provide only bearing and range towards the target, or rely on visual targets.
The emergence of novel LIDAR technologies and computer vision algorithms will lead to a new generation of rendezvous and docking systems in the near future. Such systems will be capable of autonomously detecting a target satellite at a distance of a few kilometers, and estimating its bearing, range and relative orientation under virtually any illumination and in any satellite pose.
At MDA Space Missions, we have developed a proof-of-concept vision system that uses a scanning LIDAR to estimate the pose of a known satellite. First, the vision system detects a target satellite and estimates its bearing and range. Next, the system estimates the full pose of the satellite using a 3D model. Finally, the system tracks the satellite pose with high accuracy and a high update rate. The estimated pose indicates where the docking port is located, even when the port is not visible, and enables the selection of a more efficient flight trajectory.
The proof-of-concept vision system has been integrated with a commercial time-of-flight LIDAR and tested using a moving scaled satellite replica in the MDA Vision Testbed.
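A standard building block for this kind of model-based pose tracking is the iterative closest point (ICP) algorithm, which registers a measured LIDAR point cloud against the known 3D model. The sketch below is a minimal point-to-point variant offered as an illustration, not the actual system's tracker.

```python
import numpy as np

def best_fit_transform(src, dst):
    # Least-squares rigid transform (Kabsch/SVD) mapping src onto dst,
    # assuming row i of src corresponds to row i of dst.
    cs, cd = src.mean(axis=0), dst.mean(axis=0)
    U, _, Vt = np.linalg.svd((src - cs).T @ (dst - cd))
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:   # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cd - R @ cs

def icp(src, dst, iters=10):
    # Minimal point-to-point ICP: brute-force nearest-neighbour matching
    # (O(N*M)) followed by a Kabsch update, repeated. Returns the
    # accumulated rotation R and translation t mapping src onto dst.
    cur = src.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    for _ in range(iters):
        d = np.linalg.norm(cur[:, None, :] - dst[None, :, :], axis=2)
        matched = dst[d.argmin(axis=1)]
        R, t = best_fit_transform(cur, matched)
        cur = cur @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
    return R_total, t_total
```

ICP needs a reasonable initial guess, which is why the system first estimates bearing, range and a coarse full pose before entering the high-rate tracking stage.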