This paper describes a 3D vision system that derives gripping information for a robot. Its capabilities are demonstrated in a benchmark task: clearing cafeteria trays. The system can handle a fair number of objects with substantial mutual occlusion, including transparent and specularly reflective objects. To guarantee true 3D object recognition, the system uses three cameras: two capture `skylines' of the tray, with its dishes and cutlery, from the side, while a third takes a gray-level image from above. A region of interest (ROI) in each skyline contour, corresponding to the highest object, is analyzed and features are extracted. Information from all three images is combined in a rule-based evidence-accumulation scheme to determine the identity of the object in focus. The same features that voted for this object also determine its position and orientation. Finally, the gripping points are computed as a function of the object, its pose, and the approach direction.
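The rule-based evidence-accumulation step can be illustrated with a minimal sketch. The rules, feature names, object classes, and vote weights below are purely illustrative assumptions, not the paper's actual rule base: each rule that fires casts weighted votes for candidate classes, and the class with the highest accumulated evidence is selected.

```python
from collections import defaultdict

# Hypothetical rule base: (predicate on a feature dict, weighted votes per class).
# Features, classes, and weights are illustrative, not taken from the paper.
RULES = [
    (lambda f: f["height_mm"] > 80, {"cup": 0.6, "glass": 0.5}),
    (lambda f: f["top_circular"], {"cup": 0.4, "plate": 0.7, "glass": 0.4}),
    (lambda f: f["elongated"], {"cutlery": 0.9}),
    (lambda f: f["transparent"], {"glass": 0.8}),
]

def classify(features):
    """Accumulate votes from every rule that fires; return the winning class
    and the full evidence table."""
    evidence = defaultdict(float)
    for predicate, votes in RULES:
        if predicate(features):
            for obj_class, weight in votes.items():
                evidence[obj_class] += weight
    best = max(evidence, key=evidence.get) if evidence else None
    return best, dict(evidence)

# A tall, round-topped, transparent object accumulates most evidence for "glass":
# glass = 0.5 + 0.4 + 0.8 = 1.7, cup = 0.6 + 0.4 = 1.0, plate = 0.7.
best, scores = classify(
    {"height_mm": 95, "top_circular": True, "elongated": False, "transparent": True}
)
```

Because the winning rules reference concrete image features, the same evidence table could then be queried to recover which features supported the decision, mirroring how the paper reuses the voting features to determine pose.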