Algorithms for synergistically fusing acoustic and optical sensory inputs, thereby mimicking biological attentional
processes, are described. Many existing perimeter-defense surveillance systems that use more than one sensory modality
combine different sensors' information to corroborate findings and to add data from a second modality.
In contrast to how conventional systems work, animals use information from multiple sensory inputs in a way that
improves each sensory system's performance. We demonstrated that performance is enhanced when information in one
modality is used to focus processing in the other modality (a form of attention). This synergistic bi-modal operation
improves surveillance efficacy by focusing auditory and visual "attention" on a particular target or location.
Algorithms for focusing auditory and visual sensors using detection information were developed. These
combination algorithms perform "zoom-with-enhanced-acuity" in both the visual and auditory domains, triggered by
detection in either domain. Sensory-input processing algorithms focus on specific locations, indicated by at least one of
the modalities. This spatially focused processing emulates biological attention-driven focusing. We showed that given
information about the target, the acoustic algorithms were able to achieve over 80% correct target detection at signal-to-noise
ratios (SNRs) of -20 dB and above, as compared with similar performance at SNRs of -10 dB and above without
target information from another modality. Similarly, the visual algorithm achieved performance of over 80% detection
with added noise variance of 0.001 without target indication, but maintained 100% detection at added noise variance of
0.05 when acoustic target information was taken into account.
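The benefit of cross-modal cueing described above can be illustrated with a deliberately simple sketch (not the paper's actual algorithms): a weak target is easier to find when a detection in the other modality restricts the search to a small spatial window, because far fewer noise samples compete with the target peak. All signal parameters here are illustrative.

```python
import numpy as np

def detect(signal, lo, hi):
    """Return the index of the largest-magnitude sample in signal[lo:hi]."""
    return lo + int(np.argmax(np.abs(signal[lo:hi])))

rng = np.random.default_rng(0)
n, target_idx, amp = 10_000, 4_321, 3.0       # illustrative values
trials, cued_hits, uncued_hits = 200, 0, 0
for _ in range(trials):
    x = rng.normal(0.0, 1.0, n)
    x[target_idx] += amp                      # weak target buried in noise
    # Uncued: search the whole signal.
    uncued_hits += detect(x, 0, n) == target_idx
    # Cued: the other modality indicates a narrow window around the target.
    cued_hits += detect(x, target_idx - 100, target_idx + 100) == target_idx

print(cued_hits, uncued_hits)
```

Since the cue window always contains the target, the cued detector can never do worse than the uncued one on a given trial, which mirrors the SNR gains reported above.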
The Loss Cone Imager (LCI) will sample the energetic-particle pitch-angle distributions relative to the local geomagnetic field vector in the magnetosphere as a part of the Demonstration and Science Experiment (DSX) satellite. A description of the LCI electrical interfaces and data flow will be presented. The pitch angle and energy of energetic particles are recorded by the FSH (Fixed Sensor Head) and HST (High Sensitivity Telescope) sensor electronics using
solid state detectors. Energetic particle data must be extracted from the FSH and HST by the DPU (Data Processing Unit) and stored in a format that is practical for ground data analysis. The DPU must generate a data packet that is sent to the experiment computer containing science and housekeeping data, as well as receive ground and time commands from the experiment computer. The commands are used to configure the sensor electronics and change the data
acquisition periods of the science data. The instrument works in conjunction with the WIPER (Wave-Induced Precipitation of Electron Radiation) VLF (Very Low Frequency) transmitter on the DSX satellite to view the effects of VLF waves injected into the Earth's magnetic field on the precipitation of electrons into the Loss Cone. The system is designed to operate autonomously, tracking the changing state of the transmitter to provide the most appropriate data for examining the effects of the VLF transmitter.
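The DPU's packing of science and housekeeping data into a fixed telemetry packet can be sketched as follows. The field layout below is entirely hypothetical (the actual DSX/LCI packet format is not given here); it simply shows the pack/unpack round trip the ground-analysis format must support.

```python
import struct

# Hypothetical layout: sync word, mission time, sensor ID (FSH=0, HST=1),
# eight energy-bin counts, and two housekeeping values in raw counts.
PACKET_FMT = ">HIB8HhH"   # big-endian, fixed 27-byte size
SYNC = 0xEB90

def pack_packet(t, sensor_id, bins, temp_raw, bias_raw):
    return struct.pack(PACKET_FMT, SYNC, t, sensor_id, *bins, temp_raw, bias_raw)

def unpack_packet(buf):
    fields = struct.unpack(PACKET_FMT, buf)
    assert fields[0] == SYNC, "lost packet synchronization"
    return {"time": fields[1], "sensor": fields[2],
            "bins": fields[3:11], "temp": fields[11], "bias": fields[12]}

pkt = pack_packet(123456, 1, [5, 9, 2, 0, 7, 1, 0, 3], -12, 4095)
print(unpack_packet(pkt))
```

A real implementation would follow the experiment computer's interface definition; the point is that science and housekeeping fields travel in one packet that unpacks losslessly on the ground.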
Increasing battlefield awareness can improve both the effectiveness and timeliness of response in hostile military
situations. A system that processes acoustic data is proposed to handle a variety of possible applications. The front-end
of the existing biomimetic acoustic direction finding system, a mammalian peripheral auditory system model, provides
the back-end system with what amounts to spike trains. The back-end system consists of individual algorithms tailored to
extract specific information. The back-end algorithms are transportable to FPGA platforms and other general-purpose
computers. The algorithms can be modified for use with both fixed and mobile, existing sensor platforms.
Currently, gunfire classification and localization algorithms based on both neural networks and pitch are being developed
and tested. The neural network model is trained under supervised learning to differentiate and trace various gunfire
acoustic signatures and reduce the effect of different frequency responses of microphones on different hardware
platforms. The model is being tested against impact and launch acoustic signals of various mortars, the supersonic crack
and muzzle blast of rifle shots, and other weapons. It outperforms the cross-correlation algorithm with regard to
computational efficiency, memory requirements, and noise robustness. The spike-based pitch model uses the times
between successive spike events to calculate the periodicity of the signal. Differences in the periodicity signatures and
comparisons of the overall spike activity are used to classify mortar size and event type. The localization of the gunfire
acoustic signals is further computed based on the classification result and the location of microphones and other
parameters of the existing hardware platform implementation.
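The spike-based pitch idea above, estimating periodicity from the times between successive spike events, can be sketched with a toy example. The median inter-spike interval used here is a simple, outlier-tolerant stand-in; the actual classifier compares full periodicity signatures and overall spike activity.

```python
import numpy as np

def period_from_spikes(spike_times):
    """Estimate signal periodicity from inter-spike intervals (ISIs)."""
    isis = np.diff(np.sort(np.asarray(spike_times)))
    return float(np.median(isis))              # robust to a few clutter spikes

# Synthetic spike train: one spike per cycle of a 125 Hz source (8 ms
# period) with small timing jitter, plus one spurious clutter spike.
rng = np.random.default_rng(1)
period = 0.008
spikes = np.arange(0, 1.0, period) + rng.normal(0, 2e-4, 125)
spikes = np.append(spikes, 0.0031)             # clutter spike
print(period_from_spikes(spikes))              # close to 0.008 s (125 Hz)
```

Differences between such periodicity estimates across events are what the classifier exploits to separate mortar sizes and event types.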
Limited autonomous behaviors are fast becoming a critical capability in the field of robotics as robotic applications are
used in more complicated and interactive environments. As additional sensory capabilities are added to robotic
platforms, sensor fusion to enhance and facilitate autonomous behavior becomes increasingly important. Using biology
as a model, the equivalent of a vestibular system needs to be created in order to orient the system within its environment
and allow multi-modal sensor fusion.
In mammals, the vestibular system plays a central role in physiological homeostasis and sensory information integration
(Fuller et al., Neuroscience 129 (2004) 461-471). At the level of the Superior Colliculus in the brain, there is multimodal
sensory integration across visual, auditory, somatosensory, and vestibular inputs (Wallace et al., J Neurophysiol 80
(1998) 1006-1010), with the vestibular component contributing a strong reference-frame gating input. Using a simple
model for the deep layers of the Superior Colliculus, an off-the-shelf 3-axis solid-state gyroscope and accelerometer were
used as the equivalent representation of the vestibular system. The acceleration and rotational measurements are used to
determine the relationship between a local reference frame of a robotic platform (an iRobot Packbot®) and the inertial
reference frame (the outside world), with the simulated vestibular input tightly coupled with the acoustic and optical
inputs. Field testing of the robotic platform, using acoustics to cue optical sensors coupled through a biomimetic
vestibular model for "slew to cue" gunfire detection, has shown great promise.
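The local-to-inertial frame relationship described above can be sketched in its simplest form: a yaw-only rotation built from the gyro-integrated heading, applied to an acoustic cue expressed in the robot's own frame. A full vestibular model would also integrate pitch and roll; the 90-degree heading here is illustrative.

```python
import numpy as np

def yaw_matrix(yaw):
    """Rotation from the robot's local frame to the world frame (yaw only)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

# The platform has rotated 90 degrees left (per the integrated gyro), and
# an acoustic cue arrives from straight ahead in the robot's own frame.
heading = np.pi / 2
cue_local = np.array([1.0, 0.0, 0.0])         # "dead ahead" for the robot
cue_world = yaw_matrix(heading) @ cue_local   # world-frame slew-to-cue direction
print(np.round(cue_world, 6))                 # points along world +y
```

Expressing every sensory cue in the shared inertial frame is what lets acoustic detections steer the optical sensors correctly regardless of how the platform has turned.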
Robotics is rapidly becoming an integral tool on the battlefield and in homeland security, replacing humans in
hazardous conditions. To enhance the effectiveness of robotic assets and their interaction with human operators, smart
sensors are required to give more autonomous function to robotic platforms. Biologically inspired sensors are an
essential part of this development of autonomous behavior and can increase both the capability and performance of
robotic platforms. Smart, biologically inspired acoustic sensors have the potential to extend the autonomous capabilities of robotic
platforms to include sniper detection, vehicle tracking, personnel detection, and general acoustic monitoring. The key to
enabling these capabilities is biomimetic acoustic processing using a time domain processing method based on the neural
structures of the mammalian auditory system. These biologically inspired algorithms replicate the extremely adaptive
processing of the auditory system yielding high sensitivity over broad dynamic range. The algorithms provide
tremendous robustness in noisy and echoic spaces, properties necessary for autonomous function in real-world acoustic
environments. These biomimetic acoustic algorithms also provide highly accurate localization of both persistent and
transient sounds over a wide frequency range, using baselines on the order of only inches.
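Localization from such small baselines ultimately rests on the standard far-field relation between interaural time delay (ITD) and bearing; the sketch below uses that textbook relation with illustrative numbers, not the actual spike-based processing, to show why precise timing matters at a baseline of a few inches.

```python
import numpy as np

C = 343.0          # speed of sound, m/s
D = 0.05           # microphone baseline, m (on the order of inches)

def bearing_from_itd(itd):
    """Far-field bearing (radians from broadside) from an interaural time delay."""
    return np.arcsin(np.clip(C * itd / D, -1.0, 1.0))

# At a 5 cm baseline the maximum possible ITD is only ~146 microseconds,
# so the sub-sample timing recovered from spike-train coincidences is what
# makes small-aperture localization feasible.
true_bearing = np.deg2rad(30.0)
itd = D * np.sin(true_bearing) / C            # ~73 microseconds
print(np.rad2deg(bearing_from_itd(itd)))      # recovers 30 degrees
```

The biomimetic processing supplies those microsecond-scale delay estimates robustly in noise and reverberation, which is where cross-correlation approaches struggle.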
A specialized smart sensor has been developed to interface with an iRobot Packbot® platform specifically to
enhance its autonomous behaviors in response to personnel and gunfire. The low power, highly parallel biomimetic
processor, in conjunction with a biomimetic vestibular system (discussed in the companion paper), has shown the
system's autonomous response to gunfire in complicated acoustic environments to be highly effective.
We are developing low-power microcircuitry that implements classification and direction finding systems of very small
size and small acoustic aperture. Our approach was inspired by the fact that small mammals are able to localize sounds
even though their ears may be separated by as little as a centimeter. Gerbils, in particular, are good low-frequency
localizers, which is a particularly difficult task, since a wavelength at 500 Hz is on the order of two feet. Given such signals,
cross-correlation-based methods to determine direction fail badly in the presence of even a small amount of noise, e.g. wind noise
and noise clutter common to almost any realistic environment. Circuits are being developed using both analog and
digital techniques, each of which process signals in fundamentally the same way the peripheral auditory system of
mammals processes sound. A filter bank represents filtering done by the cochlea. The auditory nerve is implemented
using a combination of an envelope detector, an automatic gain stage, and a unique one-bit A/D, which creates what
amounts to a neural impulse. These impulses are used to extract pitch characteristics, which we use to classify sound
sources such as vehicles and small and large weaponry, from AK-47s to 155 mm cannon, including mortar launches and impacts. In
addition to the pitchograms, we also use neural nets for classification.
Biomimetic signal processing that is functionally similar to that performed by the mammalian peripheral auditory system
consists of several stages. The concatenated stages of the system each favor differing types of hardware
implementations. Ideally, the front-end would be an implementation of the mammalian cochlea, which is a tapered,
nonlinear, traveling-wave amplifier. It is not a good candidate for standard digital implementations. The AM
demodulator can be implemented using digital or analog designs. The Automatic Gain Control (AGC) stage is highly
unusual. It requires filtering and multiplication in a closed-loop configuration, with bias added at each of two
concatenated stages. Its implementation is problematic in DSP, FPGA, full custom digital VLSI, and analog VLSI. The
one-bit A/D (also called the "spiking neuron"), while simple at face value, involves a complicated triggering mechanism,
which is amenable to DSP, FPGA, and custom digital implementations but computationally intense, and is well suited to an analog VLSI implementation.
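The closed-loop AGC structure described above, filtering and multiplication in feedback with a bias added at each of two concatenated stages, can be sketched behaviorally. The time constant and bias values are illustrative only; the point is the dynamic-range compression the feedback produces.

```python
import numpy as np

def agc_stage(x, fs, tau=0.02, bias=0.05):
    """One closed-loop AGC stage: each output sample is the input divided
    by a low-pass-filtered copy of the output magnitude plus a small bias,
    so loud passages are attenuated and quiet ones boosted."""
    a = np.exp(-1 / (fs * tau))
    level, y = bias, np.empty_like(x)
    for i, v in enumerate(x):
        y[i] = v / (level + bias)                  # gain set by the feedback loop
        level = a * level + (1 - a) * abs(y[i])    # filtered output level
    return y

fs = 16_000
t = np.arange(0, 0.5, 1 / fs)
# 100 Hz tone whose amplitude drops 100x halfway through.
loud = np.sin(2 * np.pi * 100 * t) * np.where(t < 0.25, 1.0, 0.01)
out = agc_stage(agc_stage(loud, fs), fs)           # two concatenated stages
early = np.max(np.abs(out[2000:4000]))             # settled loud region
late = np.max(np.abs(out[-2000:]))                 # settled quiet region
print(early / late)                                # far below the 100x input ratio
```

Even this crude two-stage loop compresses a 100x input swing to a few-times output swing, illustrating why the stage is so effective and also why its filter-multiply feedback is awkward to map onto each of the hardware technologies listed above.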
Currently, we have several hardware embodiments of the biomimetic system. The RedOwl application occupies about
160 cubic inches in volume at the present time. A DSP approach can compute 15 channels for two ears for three A/D
categories using Analog Devices Tiger SHARC-201 DSP chips within a system size estimated to be on the order of 30
cubic inches. BioMimetic Systems, Inc., a Boston University startup company, is developing an FPGA solution. Within
the university, we are also pursuing both a custom digital ASIC route and a current-mode analog ASIC.
This paper describes the flow of scientific and technological achievements beginning with a stationary "small, smart,
biomimetic acoustic processor" designed for DARPA that led to a program aimed at acoustic characterization and
direction finding for multiple, mobile platforms. ARL support and collaboration have allowed us to adapt the core
technology to multiple platforms, including a Packbot robotic platform, a soldier-worn platform, and a vehicle
platform. Each of these has varying size and power requirements, but miniaturization, which we address further in
companion papers, is an important component of the program for creating practical systems. We have configured the system to
detect and localize gunfire and tested system performance with live fire from numerous weapons such as the AK47, the
Dragunov, and the AR15. The ARL-sponsored work has led to connections with Natick Labs and the Future Force
Warrior program; the work also has many obvious applications to homeland defense, police, and civilian needs.
In this paper a real-time sound-source localization system is proposed, based on previously developed
mammalian auditory models. Traditionally, in models that use interaural time delay (ITD) estimates,
the amount of parallel computation needed to achieve real-time sound-source localization is a limiting
factor and a design challenge for hardware implementations. Therefore a new approach using a time-shared architecture
implementation is introduced.
The proposed architecture is a purely sample-driven digital system, and it closely follows the continuous-time
approach described in the models. Rather than having dedicated hardware on a per frequency channel basis, a specialized
core channel, shared across all frequency bands, is used. Because its execution time is much shorter than the
system's sample period, the proposed time-shared solution allows the same number of virtual channels to be processed as
the dedicated channels in the traditional approach. Hence, the time-shared approach achieves a highly economical and
flexible implementation using minimal silicon area. These aspects are particularly important in efficient hardware
implementation of a real time biomimetic sound source localization system.
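The equivalence between dedicated and time-shared channels can be demonstrated with a small software analogue. The per-channel datapath is stood in for by a one-pole filter step (the real core implements the ITD-model processing); what matters is that one shared core, stepping through per-channel state each sample tick, produces exactly the same results as one unit per channel.

```python
import numpy as np

def core_step(x, state, a):
    """Stand-in for the per-band datapath: one one-pole low-pass update."""
    return a * state + (1 - a) * x

def dedicated(xs, coeffs, steps):
    """Traditional approach: one processing unit per frequency channel."""
    outs = []
    for x, a in zip(xs, coeffs):
        s = 0.0
        for _ in range(steps):
            s = core_step(x, s, a)
        outs.append(s)
    return outs

def time_shared(xs, coeffs, steps):
    """Proposed approach: a single shared core iterates over per-channel
    state within each sample period, yielding identical virtual channels."""
    states = [0.0] * len(xs)
    for _ in range(steps):                     # one pass per sample period
        for ch in range(len(xs)):              # core reused for every band
            states[ch] = core_step(xs[ch], states[ch], coeffs[ch])
    return states

xs = [1.0, 0.5, -0.25, 2.0]                    # illustrative per-band inputs
coeffs = [0.9, 0.8, 0.95, 0.7]
print(time_shared(xs, coeffs, 100))            # matches dedicated() exactly
```

Because each channel's update depends only on its own state, interleaving the updates changes nothing, which is why the virtual channels are exact replacements for dedicated hardware at a fraction of the silicon area.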
A CMOS electronics driver chip to control a deformable MEMS mirror has been developed. With advances in CMOS technology, it has become possible to design and fabricate electronics operable at higher voltages than those in traditional integrated circuits. Since MEMS structures require relatively high operating voltages to achieve electrostatic actuation, these high-voltage CMOS processes offer promise for miniaturization of the corresponding drivers. Using the capability of low-voltage logic together with high-voltage output stages, a compact driver chip has been designed and fabricated through a high-voltage CMOS process. The driver is digitally controlled through address and data input bits, and through a smart low-voltage-to-high-voltage transition output stage, voltages of up to 300 V are output to each mirror electrode. A compact design allows the control of 144 channels through a single chip with 8-bit resolution at a 100 Hz refresh rate. The low-voltage stage consists of address logic together with latch stages to store the data, which in turn is converted to a high-voltage signal through a current-mode, binary-weighted scheme. This technique combines the digital-to-analogue conversion stage and the high-voltage amplifier stage, thus saving substrate area. Using this method, the 144-channel high-voltage driver was fabricated on a single chip less than 3.5 cm<sup>2</sup> in area. In this paper, the design, fabrication, and testing of these drivers are reported.
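The current-mode, binary-weighted conversion can be summarized with a behavioral model: each data bit switches a binary-weighted current onto a summing node, and the summed current sets the high-voltage electrode drive. The unit current and the current-to-voltage scaling below are illustrative, not the chip's actual design values.

```python
def dac_output_voltage(code, vmax=300.0, bits=8, i_lsb=1.0):
    """Behavioral model of an 8-bit current-mode binary-weighted DAC whose
    summed current is converted to a high-voltage electrode drive."""
    assert 0 <= code < 2 ** bits
    # Each set bit k contributes a current of 2**k * i_lsb to the sum.
    i_sum = sum(((code >> k) & 1) * (2 ** k) * i_lsb for k in range(bits))
    full_scale = (2 ** bits - 1) * i_lsb
    return vmax * i_sum / full_scale          # linear mapping up to 300 V

print(dac_output_voltage(0))       # 0.0 V
print(dac_output_voltage(255))     # 300.0 V
print(round(dac_output_voltage(128), 2))      # mid-scale, about 150.59 V
```

Merging the conversion and amplification in one current-mode stage is what lets 144 such channels fit on a single small die.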