In this work, we provide an overview of vision-based control for perching and grasping with Micro Aerial Vehicles. We investigate perching on flat, inclined, or vertical surfaces, as well as visual servoing techniques that enable quadrotors to perch autonomously by hanging from cylindrical structures using only a monocular camera and an appropriate gripper. We discuss the challenges of visual servoing, focusing on the problems of relative pose estimation, control, and trajectory planning for maneuvering a robot with respect to an object of interest. Finally, we outline the challenges that remain for fully autonomous perching and grasping in more realistic scenarios.
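For background, the classical image-based visual servoing law for point features commands a camera velocity v = -λ L⁺ (s - s*), where L is the interaction matrix. The sketch below is standard textbook IBVS, not necessarily the exact controller used for the perching maneuvers; feature coordinates, depths, and the gain are illustrative assumptions:

```python
import numpy as np

def interaction_matrix(x, y, Z):
    """Interaction (image Jacobian) matrix for one normalized image
    point at (x, y) with estimated depth Z, as in classical IBVS."""
    return np.array([
        [-1.0 / Z, 0.0, x / Z, x * y, -(1.0 + x * x), y],
        [0.0, -1.0 / Z, y / Z, 1.0 + y * y, -x * y, -x],
    ])

def ibvs_velocity(features, desired, depths, gain=0.5):
    """Camera velocity command v = -gain * pinv(L) @ (s - s*).

    features, desired: (N, 2) arrays of current and desired normalized
    image coordinates; depths: (N,) estimated point depths."""
    error = (features - desired).reshape(-1)
    L = np.vstack([interaction_matrix(x, y, Z)
                   for (x, y), Z in zip(features, depths)])
    return -gain * np.linalg.pinv(L) @ error

# When the features already match the desired configuration,
# the commanded velocity is zero.
s = np.array([[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]])
v = ibvs_velocity(s, s, np.ones(4))
```

The pseudo-inverse handles the redundant 8-equation, 6-unknown system for four feature points; the resulting velocity drives the image error exponentially to zero under the usual IBVS assumptions.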
We consider the problem of generating temporally consistent point cloud segmentations from streaming RGB-D data, where every incoming frame extends existing labels to new points or contributes new labels while maintaining the labels for pre-existing segments. Our approach generates an over-segmentation based on voxel cloud connectivity, where a modified k-means algorithm selects supervoxel seeds and associates similar neighboring voxels to form segments. Given the data stream from a potentially mobile sensor, we solve for the camera transformation between consecutive frames using a joint optimization over point correspondences and image appearance. The aligned point cloud may then be integrated into a consistent model coordinate frame. Previously labeled points are used to mask incoming points from the new frame, while new and previous boundary points extend the existing segmentation. We evaluate the algorithm on newly-generated RGB-D datasets.
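A minimal, hypothetical sketch of the voxelization and seed-assignment step in Python is given below. It uses spatial distance only; the actual algorithm also uses color and normal similarity and enforces voxel connectivity, and all function names and parameters here are illustrative:

```python
import numpy as np

def voxelize(points, voxel_size):
    """Quantize 3D points into a voxel grid and return the centroid
    of the points falling inside each occupied voxel."""
    keys = np.floor(points / voxel_size).astype(np.int64)
    uniq, inv = np.unique(keys, axis=0, return_inverse=True)
    inv = inv.reshape(-1)
    counts = np.bincount(inv, minlength=len(uniq)).astype(float)
    return np.stack(
        [np.bincount(inv, weights=points[:, d], minlength=len(uniq)) / counts
         for d in range(3)], axis=1)

def supervoxel_labels(voxels, seed_spacing):
    """Assign each voxel to its nearest seed, where seeds are one voxel
    per occupied cell of a coarser grid of resolution seed_spacing --
    a simplified, purely spatial stand-in for supervoxel seeding."""
    seed_keys = np.floor(voxels / seed_spacing).astype(np.int64)
    _, first = np.unique(seed_keys, axis=0, return_index=True)
    seeds = voxels[np.sort(first)]
    dists = np.linalg.norm(voxels[:, None, :] - seeds[None, :, :], axis=2)
    return dists.argmin(axis=1), seeds
```

In the streaming setting described above, labels produced for one frame would be carried over to the next after aligning the new point cloud into the model coordinate frame.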
Semantic perception involves naming objects and features in the scene, understanding the relations between them, and
understanding the behaviors of agents, e.g., people, and their intent from sensor data. Semantic perception is a central
component of future UGVs to provide representations which 1) can be used for higher-level reasoning and tactical
behaviors, beyond the immediate needs of autonomous mobility, and 2) provide an intuitive description of the robot's
environment in terms of semantic elements that can be shared effectively with a human operator. In this paper, we
summarize the main approaches that we are investigating in the RCTA as initial steps toward the development of
perception systems for UGVs.
Most of today's robot vehicles are equipped with omnidirectional
sensors which provide surround awareness and easier navigation.
Because appearance persists across viewpoints in omnidirectional images,
many global navigation and formation control tasks need only reference
images of target positions or objects rather than landmarks or
fiducials. In this paper, we study the problem of template
matching in spherical images. The natural transformation of a pattern
on the sphere is a 3D rotation, and template matching amounts to
localizing a target in any orientation given a reference
image. Unfortunately, the support of the template is space-variant in
the Euler angle parameterization. Here we propose a new method
that matches the gradients of the image and the template
using a space-invariant operation.
Using properties of the angular momentum operators, we prove that the
gradient correlation can be computed efficiently as the 3D inverse
Fourier transform of a linear combination of spherical
harmonics. An exhaustive search localizes the maximum of this
correlation. Experimental results on real data show highly accurate
localization for a variety of targets. In future work, we plan to
address targets appearing in different scales.
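As an illustrative sketch only (not the spherical-harmonic implementation described above), the exhaustive correlation search can be pictured as below, restricted to a single rotation axis and evaluated by brute force; the grid resolution and functions are hypothetical:

```python
import numpy as np

def sphere_grid(n_theta, n_phi):
    """Unit direction vectors on an equiangular (theta, phi) grid."""
    theta = np.linspace(0.05, np.pi - 0.05, n_theta)
    phi = np.linspace(0.0, 2.0 * np.pi, n_phi, endpoint=False)
    T, P = np.meshgrid(theta, phi, indexing="ij")
    return np.stack([np.sin(T) * np.cos(P),
                     np.sin(T) * np.sin(P),
                     np.cos(T)], axis=-1).reshape(-1, 3)

def rot_z(a):
    """Rotation by angle a about the z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def best_rotation(image_vals, template_fn, grid, angles):
    """Exhaustive search for the z-rotation maximizing the correlation
    between image samples on the sphere and the rotated template.
    (The full method searches all three Euler angles and evaluates the
    correlation in closed form via spherical harmonics; this 1-D
    brute-force loop only illustrates the matching criterion.)"""
    best_score, best_angle = -np.inf, None
    for a in angles:
        # Rotated template: (R f)(g) = f(R^T g); for row vectors,
        # grid @ rot_z(a) computes R^T applied to each direction g.
        score = image_vals @ template_fn(grid @ rot_z(a))
        if score > best_score:
            best_score, best_angle = score, a
    return best_angle
```

The closed-form evaluation via the inverse Fourier transform replaces this per-rotation resampling loop, which is what makes the exhaustive search over all orientations tractable.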