Attention Deficit Hyperactivity Disorder (ADHD) has been receiving considerable attention recently, mainly because
it is one of the most common brain disorders among children and little is known about its cause. In this study,
we propose a novel approach for the automatic classification of subjects with ADHD and control subjects using
functional Magnetic Resonance Imaging (fMRI) data of resting-state brains.
For this purpose, we compute the correlation between every possible voxel pair within a subject and over the
time frame of the experimental protocol. A network of voxels is constructed by representing a high correlation
value between any two voxels as an edge. A Bag-of-Words (BoW) approach is used to represent each subject
as a histogram of network features, such as the degree of each voxel. The classification is done using
a Support Vector Machine (SVM). We also investigate the use of raw intensity values in the time series for
each voxel. Here, every subject is represented as a combined histogram of network and raw intensity features.
Experimental results verify that the classification accuracy improves when the combined histogram is used.
We tested our approach on a highly challenging dataset released by NITRC for ADHD-200 competition
and obtained promising results. The dataset is not only large but also includes subjects from different
demographic and age groups. To the best of our knowledge, this is the first paper to propose a BoW approach
for any functional brain disorder classification, and we believe that this approach will be useful in the
analysis of many brain-related conditions.
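As a concrete illustration of the network-feature step above, the sketch below builds a voxel network from pairwise correlations and summarizes it as a normalized degree histogram, one descriptor per subject. The data are synthetic, and the correlation threshold and bin count are illustrative assumptions rather than values from the paper; the SVM stage is omitted.

```python
import numpy as np

def degree_histogram(timeseries, corr_thresh=0.7, n_bins=8):
    """Build a voxel network from pairwise correlations and return a
    normalized degree histogram (a BoW-style descriptor per subject).

    timeseries : (n_voxels, n_timepoints) array of fMRI intensities.
    corr_thresh and n_bins are illustrative, not from the paper.
    """
    corr = np.corrcoef(timeseries)           # pairwise voxel correlations
    np.fill_diagonal(corr, 0.0)              # ignore self-correlation
    adjacency = np.abs(corr) > corr_thresh   # edge where correlation is high
    degrees = adjacency.sum(axis=1)          # degree of each voxel
    hist, _ = np.histogram(degrees, bins=n_bins,
                           range=(0, timeseries.shape[0]))
    return hist / hist.sum()                 # normalized histogram feature

# toy subject: 20 voxels, 50 timepoints
rng = np.random.default_rng(0)
subject = rng.standard_normal((20, 50))
feat = degree_histogram(subject)
```

In a full pipeline, one such histogram per subject would be fed to the SVM classifier described in the abstract.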
We present a mathematical technique for estimating new perspective views of an object
from a single image. Unlike traditional graphics or ray tracing methods, our approach
treats the view-morphing problem as a 2-D linear prediction process. We first estimate
the prediction parameters in a reduced dimensional space using features extracted from
"training" images of the object. Given an arbitrary view of the object, the features of the
new view are linearly predicted, from which the morphed image of the object is
reconstructed. The proposed approach can be used for rapidly incorporating new objects
in the knowledge base of a computer vision system and may have advantages in low-contrast
situations where it is difficult to establish correspondence between sample views.
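The prediction step can be sketched as follows: features of the training views are reduced with PCA, and a least-squares linear predictor maps input-view features to target-view features in the reduced space. The data, dimensions, and the low-rank structure are synthetic assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_views, n_feat, n_comp = 40, 60, 5

# synthetic "training" views whose features live in a low-dimensional space
X = rng.standard_normal((n_views, n_comp)) @ rng.standard_normal((n_comp, n_feat))
A_true = rng.standard_normal((n_feat, n_feat))
Y = X @ A_true                                # features of the target views

# reduce dimensionality with PCA (via SVD of the centered data)
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:n_comp].T                             # projection to n_comp dimensions

Xr, Yr = X @ P, Y @ P
W, *_ = np.linalg.lstsq(Xr, Yr, rcond=None)   # linear predictor in reduced space

pred = Xr @ W                                 # predicted reduced features
err = np.linalg.norm(pred - Yr) / np.linalg.norm(Yr)
```

Because the synthetic features are exactly low-rank, the reduced-space predictor recovers the target features almost exactly; with real image features the fit would only be approximate.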
In this work we propose a method for securing port facilities which uses a set of video cameras to automatically detect
various vessel classes moving within buffer zones and off-limit areas. Vessels are detected by an edge-enhanced spatiotemporal
optimal trade-off maximum average correlation height filter which is capable of discriminating between vessel
classes while allowing for intra-class variability. Vessel detections are cross-referenced with e-NOAD data in order to
verify the vessel's access to the port. Our approach does not require foreground/background modeling in order to detect
vessels, and it is therefore effective in the presence of dynamic backgrounds, such as moving water, which
are prevalent in port facilities. Furthermore, our approach is computationally efficient, thus rendering it more suitable
for real-time port surveillance systems. We evaluate our method on a dataset collected from various port locations which
contains a wide range of vessel classes.
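The idea of detection by frequency-domain correlation filtering can be sketched as below. This is a simplified average-of-spectra filter on synthetic image chips, not the full edge-enhanced spatiotemporal OT-MACH filter of the paper; the chip size, noise level, and regularizer are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
H = W = 32
template = np.zeros((H, W))
template[12:20, 10:22] = 1.0                    # crude "vessel" shape

# training chips: noisy copies of the template at a fixed position
train = [template + 0.05 * rng.standard_normal((H, W)) for _ in range(5)]
spectra = np.array([np.fft.fft2(t) for t in train])
mean_spec = spectra.mean(axis=0)
power = (np.abs(spectra) ** 2).mean(axis=0)
h = np.conj(mean_spec) / (power + 1e-6)         # filter in the frequency domain

# test scene with the target at the trained position
scene = 0.05 * rng.standard_normal((H, W))
scene[12:20, 10:22] += 1.0
corr = np.real(np.fft.ifft2(np.fft.fft2(scene) * h))
peak = np.unravel_index(np.argmax(corr), corr.shape)
```

Since the target sits at the same position as in the training chips, the correlation peak appears at zero shift; in practice the peak location gives the detected vessel position in the frame.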
In this paper, we present the PEGASUS system. PEGASUS is an integrated news video search system with three major components: (1) a user interface, where users can formulate search queries and browse the returned results; (2) a server, which takes queries from the user interface, performs the searches, and ranks the search results before returning them to the users; and (3) data storage, which is composed of a feature indexing system and a video database. The PEGASUS system allows users to perform fast multi-modality video search using video features, including features from both the audio and visual portions of the videos. To search for a target topic, the user first submits an initial query using prior knowledge of the topic. Then, through a series of relevance feedback processes, a set of relevant video shots is returned by the system to the user. The user can further view the results using the video-on-demand (VoD) functionality of the system. The system has been constructed using over 45,000 news video shots, and it is available online for public access.
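A relevance-feedback loop of the kind described can be sketched with a Rocchio-style query update, a standard technique in vector-space retrieval; the abstract does not specify PEGASUS's exact feedback model, so the update rule, weights, and feature vectors below are illustrative assumptions.

```python
import numpy as np

def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant shots and away from
    non-relevant ones (weights are conventional defaults, not PEGASUS's)."""
    q = alpha * query
    if len(relevant):
        q = q + beta * np.mean(relevant, axis=0)
    if len(nonrelevant):
        q = q - gamma * np.mean(nonrelevant, axis=0)
    return q

# toy 3-dimensional shot features
query = np.array([1.0, 0.0, 0.0])
rel = np.array([[0.8, 0.6, 0.0], [0.9, 0.4, 0.1]])   # shots the user marked relevant
nonrel = np.array([[0.0, 0.0, 1.0]])                 # shots marked non-relevant
new_q = rocchio(query, rel, nonrel)
```

Each feedback round re-ranks the database against the updated query, which is how a series of feedback steps can home in on the target topic.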
In this paper, we present an algorithm for the autonomous navigation of an unmanned aerial vehicle (UAV) following a moving target. The UAV in consideration is a fixed-wing aircraft that has physical constraints on airspeed and maneuverability. The target, however, is not considered to be constrained and can move in any general pattern. We present a single circular-pattern navigation algorithm that works for targets moving at any speed in any pattern, whereas other methods switch between different navigation strategies in different scenarios. The simulation performed takes into consideration that the aircraft also needs to visually track the target using a mounted camera. The camera is also controlled by the algorithm according to the position and orientation of the aircraft and the position of the target. Experiments show that the presented algorithm successfully tracks and follows moving targets.
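One standard way to realize circular-pattern standoff tracking is a Lyapunov guidance vector field, sketched below; the abstract does not give the paper's exact guidance law, so this field, the parameters, and the simplified kinematics (the fixed-wing turn-rate constraint is omitted) are assumptions for illustration.

```python
import numpy as np

v, R, dt = 10.0, 50.0, 0.05          # airspeed, standoff radius, time step
uav = np.array([150.0, 0.0])
target = np.array([0.0, 0.0])
target_vel = np.array([1.0, 0.0])    # slowly moving target

for _ in range(1200):                # 60 s of simulation
    x, y = uav - target
    d = np.hypot(x, y)
    # Lyapunov vector field: purely tangential on the standoff circle,
    # inward for d > R, outward for d < R; its magnitude is v everywhere,
    # so the sketch respects a constant-airspeed constraint.
    vx = -v * (x * (d**2 - R**2) + y * 2 * R * d) / (d * (d**2 + R**2))
    vy = -v * (y * (d**2 - R**2) - x * 2 * R * d) / (d * (d**2 + R**2))
    uav = uav + dt * np.array([vx, vy])
    target = target + dt * target_vel

standoff = np.linalg.norm(uav - target)
```

For a target slower than the aircraft, the UAV settles into an orbit whose radius oscillates around the standoff distance R, which is the single-pattern behavior the abstract describes.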
Unmanned Aerial Vehicles (UAVs) are becoming a core intelligence asset for reconnaissance, surveillance, and target tracking in urban and battlefield settings. To achieve the goal of automated tracking of objects in UAV videos, we have developed a system called COCOA. It processes the video stream through a number of stages. In the first stage, platform motion compensation is performed. Moving object detection is then performed to find the regions of interest, from which object contours are extracted by a level-set-based segmentation. Finally, blob-based tracking is performed for each detected object, and global tracks are generated for higher-level processing. COCOA is customizable to different sensor resolutions and is capable of tracking targets as small as 100 pixels. It works seamlessly for both visible and thermal imaging modes. The system is implemented in Matlab and works in batch mode.
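The platform-motion-compensation stage can be sketched as fitting a global affine transform between consecutive frames from point correspondences; the correspondences below are synthetic, and a least-squares affine fit is one common choice rather than necessarily COCOA's exact method.

```python
import numpy as np

rng = np.random.default_rng(4)
pts = rng.uniform(0, 100, size=(30, 2))        # feature points in frame t

# synthetic camera motion: rotation/scale-like affine part plus translation
A_true = np.array([[1.02, -0.05], [0.04, 0.99]])
t_true = np.array([3.0, -1.5])
pts2 = pts @ A_true.T + t_true                 # the same features in frame t+1

# solve [x y 1] @ M = [x' y'] for the six affine parameters
X = np.hstack([pts, np.ones((len(pts), 1))])
M, *_ = np.linalg.lstsq(X, pts2, rcond=None)   # (3, 2) parameter matrix

compensated = X @ M                            # frame-t points mapped into t+1
residual = np.abs(compensated - pts2).max()
```

Warping each frame by the estimated transform cancels the platform motion, so that the subsequent moving-object-detection stage sees only object motion.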
A camera mounted on an aerial vehicle provides an excellent means of
monitoring large areas of a scene. Utilizing several such cameras on
different aerial vehicles allows further flexibility, in terms of
increased visual scope and in the pursuit of multiple targets. The
underlying concept of such co-operative sensing is to use
inter-camera relationships to give global context to 'locally'
obtained information at each camera. It is desirable, therefore,
that the data collected at each camera and the inter-camera
relationship discerned by the system be presented in a coherent
visualization. Since the cameras are mounted on UAVs and large swaths
of area may be traversed in a short period of time, coherent
visualization is indispensable for applications like surveillance
and reconnaissance. While most visualization approaches have
hitherto focused on data from a single camera at a time, we show
that, by tracking objects across cameras, widely separated mosaics
can be aligned, both in space and in color, for concurrent
visualization. Results are shown on a number of real sequences,
validating our qualitative models.
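The color-alignment part of the mosaic registration can be sketched as fitting a per-channel gain and bias over pixels shared by two mosaics; the overlap data below is synthetic, and a linear gain/bias model is one simple choice of color map, hedged here rather than claimed as the paper's exact model.

```python
import numpy as np

rng = np.random.default_rng(3)
overlap_a = rng.uniform(0.0, 1.0, size=(500, 3))   # overlap colors in mosaic A
gain_true = np.array([1.2, 0.9, 1.1])              # synthetic color mismatch
bias_true = np.array([0.05, -0.02, 0.1])
overlap_b = overlap_a * gain_true + bias_true      # same pixels seen in mosaic B

gains, biases = np.empty(3), np.empty(3)
for c in range(3):                                 # fit b = g * a + o per channel
    A = np.stack([overlap_a[:, c], np.ones(len(overlap_a))], axis=1)
    (gains[c], biases[c]), *_ = np.linalg.lstsq(A, overlap_b[:, c], rcond=None)

aligned = overlap_a * gains + biases               # mosaic A re-colored to match B
```

Applying the fitted map to all of mosaic A removes the color seam where the two mosaics meet, complementing the spatial alignment.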
Automatic target detection and recognition (ATD/R) remains a challenging problem for unmanned and unattended systems. Promising solutions using Electro-Optical sensors such as LADARs, FLIRs, and TVs are evolving. The key issues are not only the performance of the individual sensors, but also the mutual calibration of the sensors and their collective behavior. This paper presents an overview of the challenges encountered in two separate ATD/R scenarios, and the methods that have been proposed for addressing them. Specifically, advanced techniques that exploit multiple views, collaborative sensor behavior and new sensing paradigms are reviewed and the concepts are illustrated by means of several examples.
This paper describes a novel approach to automatically recognize a target based on a view-morphing database constructed by our multi-view morphing algorithm. Instead of using a single reference image, a set of images or a video sequence is used to construct the reference database, where these images are organized by a triangulation of the viewing sphere. At each vertex of a triangle, one image is stored in the database as the reference view from a specific viewing direction. For each triangle, our tri-view morphing algorithm can synthesize a high-quality image for an arbitrary novel viewpoint among the three neighboring reference images, and the barycentric blending scheme guarantees seamless transitions between neighboring triangles. Using the synthesized images, we apply an appearance-based recognition technique to recognize the target. In addition, using the proposed method, the pose of the object or the camera motion can be approximately estimated. Several examples are demonstrated in the experiments to show that our approach is effective and promising.
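The barycentric blending idea can be sketched as below: a novel view inside a triangle is a weighted combination of its three reference views, with weights given by barycentric coordinates. This is a plain pixel-space blend on toy arrays; the actual tri-view morphing warps the reference images into alignment before blending.

```python
import numpy as np

def tri_view_blend(views, bary):
    """Blend three reference views with barycentric weights.

    views : three (H, W) arrays (already warped into alignment in the
            real method; taken as-is in this toy sketch).
    bary  : three non-negative weights summing to 1.
    """
    w = np.asarray(bary, dtype=float)
    assert w.shape == (3,) and abs(w.sum() - 1.0) < 1e-9
    return sum(wi * v for wi, v in zip(w, views))

v0, v1, v2 = (np.full((4, 4), k, dtype=float) for k in (0.0, 1.0, 2.0))
inside = tri_view_blend([v0, v1, v2], (0.2, 0.3, 0.5))  # interior viewpoint
at_vertex = tri_view_blend([v0, v1, v2], (1.0, 0.0, 0.0))
```

At a triangle vertex the weights collapse to a single reference view, which is exactly the property that makes transitions between neighboring triangles seamless.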
There are approximately 261,000 rail crossings in the United
States according to the studies by the National Highway Traffic
Safety Administration (NHTSA) and Federal Railroad Administration
(FRA). From 1993 to 1998, there were over 25,000 highway-rail
crossing incidents involving motor vehicles - averaging 4,167
incidents a year. In this paper, we present a real-time computer
vision system for the monitoring of the movement of pedestrians,
bikers, animals and vehicles at railroad intersections. The video
is processed for the detection of uncharacteristic events,
triggering an immediate warning system. In order to recognize the
events, the system first performs robust object detection and
tracking. Next, a classification algorithm is used to determine
whether the detected object is a pedestrian, biker, group, or
vehicle, allowing inferences on whether the behavior of the object
is characteristic or not. Due to the ubiquity of low cost, low
power, and high quality video cameras, increased computing power
and memory capacity, the proposed approach provides a cost
effective and scalable solution to this important problem.
Furthermore, the system has the potential to significantly
decrease the number of accidents and therefore the resulting
deaths and injuries that occur at railroad crossings. We have
field tested our system at two sites, a rail-highway grade
crossing, and a trestle located in Central Florida, and we present
results on six hours of collected data.
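The object-detection stage for a fixed crossing camera can be sketched with a running-average background model and thresholding, a common baseline for stationary-camera detection; the frame data, learning rate, and threshold below are illustrative assumptions, not the paper's exact detector.

```python
import numpy as np

rng = np.random.default_rng(5)
H, W = 24, 32
# empty-scene frames: constant intensity 100 plus sensor noise
frames = [100 + rng.normal(0, 2, (H, W)) for _ in range(20)]

alpha = 0.1                                   # illustrative learning rate
bg = frames[0].copy()
for f in frames[1:]:
    bg = (1 - alpha) * bg + alpha * f         # running-average background

# a new frame in which a bright object (e.g. a vehicle) has appeared
test = 100 + rng.normal(0, 2, (H, W))
test[10:16, 12:20] += 60
mask = np.abs(test - bg) > 20                 # foreground mask by thresholding
```

The resulting mask feeds the tracking and classification stages; slowly updating the background keeps the model robust to gradual lighting changes at an outdoor crossing.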
Lockheed Martin and the University of Central Florida (UCF) are jointly investigating the use of a network of COTS video cameras and computers for a variety of security and surveillance operations. The detection and tracking of humans as well as vehicles is of interest. The three main novel aspects of the work presented in this paper are (1) the integration of automatic target detection and recognition techniques with tracking, (2) the handover and seamless tracking of objects across a network, and (3) the development of real-time communication and messaging protocols using COTS networking components. The approach leverages the KNIGHT human detection and tracking system previously developed at UCF, and Lockheed Martin's automatic target detection and recognition (ATD/R) algorithms. The work presented in this paper builds on these capabilities for surveillance using stationary sensors, with the goal of subsequently addressing the problem of moving platforms.
Multi-sensor fusion deals with the combination of complementary and sometimes contradictory sensor data into a reliable estimate of the environment, achieving a whole that is greater than the sum of its parts. Multiple sensors can be used to overcome problems associated with object recognition systems. The introduction of multiple sensors into such a system emphasizes the need for useful methods of combining sensor outputs. Multiple sensors can yield duplicate information that can be used to verify input and possibly to ease the task of object recognition. Since each sensor output contains noise, multiple sensors can be used to determine the same property with the consensus of all sensors. We introduce a Bayesian approach for combining sensor outputs that increases the confidence in features supported by multiple sensors and reduces the confidence in unsupported features. This paper describes how feature-level input from an arbitrary number of sensors may be combined to make 3-D object recognition more accurate. An example involving features from range, intensity, and tactile sensors is given.
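The Bayesian combination described above can be sketched in odds form: assuming conditionally independent sensors, each sensor contributes a likelihood ratio that multiplies the prior odds that a feature is present. The numbers below are illustrative, not from the paper.

```python
import numpy as np

def fuse(prior, likelihood_ratios):
    """Fuse sensor evidence about a feature via Bayes' rule in odds form.

    prior             : P(feature present) before any sensor reading.
    likelihood_ratios : per-sensor P(reading | present) / P(reading | absent),
                        assumed conditionally independent given the feature.
    """
    odds = prior / (1 - prior)
    for lr in likelihood_ratios:
        odds *= lr                        # each sensor updates the odds
    return odds / (1 + odds)              # back to a probability

p_single = fuse(0.5, [3.0])               # one supporting sensor
p_multi = fuse(0.5, [3.0, 3.0, 3.0])      # three supporting sensors agree
p_contra = fuse(0.5, [3.0, 1 / 3.0])      # contradictory sensors cancel out
```

Agreement among sensors drives the posterior confidence up, while a contradictory reading pulls it back toward the prior, which is exactly the support/unsupported behavior the abstract describes.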