In this paper we examine two fundamental problems related to object recognition for point features under
full perspective projection. The first problem involves the geometric constraints (object-image equations) that
must hold between a set of object feature points (object configuration) and any image of those points under
a full perspective projection, that is, the pinhole camera model of image formation. These constraints are
formulated in an invariant way, so that object pose, image orientation, or the choice of coordinates used to
express the feature point locations either on the object or in the image are irrelevant. These constraints turn out
to be expressions in the shape coordinates calculated from the feature point coordinates. The second problem
concerns the notion of shape and a description of the resulting shape spaces. These spaces acquire certain natural
metrics, but the metrics are often hard to compute. We will discuss certain cases where the computations are
manageable, but will leave the general case to a future paper.
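As a minimal illustration of the full perspective (pinhole) projection referred to above, the following sketch maps object feature points to image points. The function name, the focal length parameter, and the sample coordinates are assumptions made for illustration only. The sketch shows the image formation model, not the object-image equations themselves.

```python
def project_pinhole(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z), given in camera coordinates,
    onto the image plane of an ideal pinhole camera.

    focal_length is a hypothetical parameter; the image plane is
    assumed to lie at Z = focal_length with the pinhole at the origin.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    # Perspective division: image coordinates scale inversely with depth.
    return (focal_length * x / z, focal_length * y / z)


# Hypothetical object configuration: three feature points in camera coordinates.
object_points = [(0.0, 0.0, 4.0), (1.0, 0.0, 4.0), (0.0, 2.0, 8.0)]

# The corresponding image of those feature points.
image_points = [project_pinhole(p) for p in object_points]
```

The image coordinates produced this way change with the pose of the object and with the coordinates chosen on the object and in the image, which is why the constraints are stated in terms of invariant shape coordinates rather than raw point locations.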
Taken together, the results in this paper provide a way to understand the relationship that exists between
3D geometry and its "residual" in a 2D image. This relationship is completely characterized (for a particular
combination of features) by the above set of fundamental equations in the 3D and 2D shape coordinates. The
equations can be used to test for the geometric consistency between an object and an image. For example, by
fixing point features on a known object, we get constraints on the 2D shape coordinates of possible images of
those features. Conversely, if we have specific 2D features in an image, we will get constraints on the 3D shape
coordinates of objects with feature points capable of producing that image. This yields a test for which object is
being viewed. The object-image equations are thus a fundamental tool for attacking identification/recognition
problems in computer vision and automatic target recognition applications.