It has been shown that digital algebraic surfaces can be characterized by inequality conditions that follow from Helly's Theorem on convex sets. As a result, digital algebraic surfaces can be recognized by examining the validity of large collections of inequalities. These inequality conditions can be regarded as a natural extension of the chord property proved by Rosenfeld for digital straight lines. In this paper we show that these inequalities can also be used to measure an absolute-value distance; they can be used, for example, to measure how far a digital set is from being digitally straight. Since the collection of measurements required to compute this distance can be very large, it makes sense to study its mathematical structure. We show that it has the structure of a polynomial ideal. For digital straight lines this ideal is generated by a single polynomial.
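As an illustration, Rosenfeld's chord property can be tested directly by sampling chords; the sketch below (the point sets, sampling density, and strict Chebyshev threshold are illustrative choices, not taken from the paper) checks it for small digital sets:

```python
def has_chord_property(points, samples=64):
    """Approximate check of Rosenfeld's chord property: every point of the
    real chord between two members of the set must lie at Chebyshev
    distance < 1 from some member of the set (checked by sampling)."""
    pts = list(points)
    for (px, py) in pts:
        for (qx, qy) in pts:
            for k in range(samples + 1):
                t = k / samples
                x, y = px + t * (qx - px), py + t * (qy - py)
                if not any(max(abs(x - ix), abs(y - iy)) < 1.0
                           for (ix, iy) in pts):
                    return False
    return True

line = {(i, round(0.4 * i)) for i in range(10)}   # a digital straight segment
bent = {(0, 0), (1, 0), (2, 2)}                   # violates the chord property
```

For the bent set, the midpoint (1, 1) of the chord from (0, 0) to (2, 2) is at Chebyshev distance exactly 1 from every set point, so the property fails.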
The difficulties met in the study of digital planes stem from an inadequate definition. We introduce a Diophantine point of view which, despite its formal appearance, is much more convenient and fits perfectly with the analogous notion of digital lines. We give algorithms for drawing digital lines and planes and solve the problem of their intersection.
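The paper's own definitions are not reproduced in this abstract; as a hedged stand-in, the standard arithmetic (Diophantine) definition of a naive digital line, 0 <= a*x - b*y + c < max(|a|, |b|), can be drawn as follows (the brute-force y search window is purely illustrative):

```python
def naive_digital_line(a, b, c, x_range):
    """Points of the arithmetic (Diophantine) digital line
    0 <= a*x - b*y + c < max(|a|, |b|)  (the 'naive' thickness)."""
    w = max(abs(a), abs(b))
    pts = set()
    for x in x_range:
        for y in range(-10 * w, 10 * w):   # small illustrative search window
            if 0 <= a * x - b * y + c < w:
                pts.add((x, y))
    return pts

# slope 2/5 through the origin: exactly one y per x, namely floor(2x/5)
line = naive_digital_line(2, 5, 0, range(10))
```

With this definition the drawing algorithm is just the evaluation of a double inequality, which is what makes the Diophantine viewpoint convenient.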
Starting with the intuitive concept of `nearness' as a binary relation, semi-proximity spaces (sp-spaces) are defined. The restrictions on semi-proximity spaces are weaker than those on topological proximity spaces; nevertheless, semi-proximity spaces generalize classical topological spaces. Moreover, it is possible to describe all digital pictures used in computer vision and computer graphics as non-trivial semi-proximity spaces, which is not possible in classical topology. Therefore, we use semi-proximity spaces to establish a formal relationship between the `topological' concepts of digital image processing and their continuous counterparts in R^n. Especially interesting are continuous functions in semi-proximity spaces. A definition of a `nearly' bicontinuous function is given which does not require the function to be one-to-one. A nearly bicontinuous function preserves connectedness in both directions; therefore, nearly bicontinuous functions can be used to characterize well-behaved operations on digital images such as thinning. Further, it is shown that the deletion of a simple point can be treated as a nearly bicontinuous function. These properties, and the fact that a variety of nearness relations can be defined on digital pictures, indicate that nearly continuous functions are a useful tool in the difficult task of shape description.
In this paper we present conditions which guarantee that every digitization process preserves important topological and differential-geometric properties. These conditions also allow us to determine the correct digitization resolution for a given class of real objects. Knowing that these properties are invariant under digitization, we can then use them in feature-based recognition. Moreover, these conditions imply that only a few digital patterns can occur as neighborhoods of boundary points in the digitization. This is very useful for noise detection: if the neighborhood of a boundary point does not match one of these patterns, it must be due to noise. Our definition of digitization approximates many real digitization processes. The digitization process is modeled as a mapping from continuous sets representing real objects to discrete sets represented as digital images. We show that an object A and the digitization of A are homotopy equivalent. This implies, for example, that the digitization of A preserves the connectivity of the object and its complement. Moreover, we show that the digitization of A does not change the qualitative differential-geometric properties of the boundary of A, i.e., a locally convex boundary point cannot be digitized to a locally concave pixel and a locally concave boundary point cannot be digitized to a locally convex pixel.
Thinning is a process which erodes an object layer by layer until only its skeleton is left. A thinning algorithm should preserve connectivity, i.e., any object and its skeleton should maintain the same connectivity structure. In this paper, we propose sufficient conditions so that any 3D 6-subiteration parallel thinning algorithm satisfying these conditions is guaranteed to preserve connectivity.
A new approach for parabolic curve detection based on the Hough transform is presented. This approach uses the vertices of the curves as the detection parameters, which also indicate the positions of maximum curvature. Evidence for the transformation is gathered from the edge gradient obtained from an edge-enhancement operator; the Sobel operator is used for edge enhancement. For the detection of parabolic curves in any orientation, a coordinate transformation matrix is used to derive a new parabolic equation which involves the edge-gradient information. In the proposed algorithm, parabolic curves in any orientation are detected using a 3D accumulator array; the new algorithm therefore reduces the accumulator size from the 4D array required by an ordinary Hough transform for parabola detection. The reduced parameter space improves the processing time and decreases the storage requirement for the detection of parabolic curves. The paper reports on the accuracy obtained when this approach is used on natural and synthetic images containing parabolic and other curves.
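A rough sketch of the vertex-parameter idea, simplified to axis-aligned parabolas y = a*(x - x0)^2 + y0 (the paper handles arbitrary orientation via a coordinate transformation, and the candidate set for a is an illustrative assumption): each edge point with tangent slope g = 2a*(x - x0) votes in a 3D accumulator indexed by (a, x0, y0):

```python
from collections import defaultdict

def parabola_hough(edges, a_candidates):
    """Vote for vertices (x0, y0) of y = a*(x - x0)**2 + y0 using the edge
    gradient g = dy/dx = 2*a*(x - x0); 3D accumulator over (a, x0, y0)."""
    acc = defaultdict(int)
    for x, y, g in edges:                     # g is the local tangent slope
        for a in a_candidates:
            x0 = x - g / (2.0 * a)            # invert the gradient relation
            y0 = y - a * (x - x0) ** 2
            acc[(a, round(x0), round(y0))] += 1
    return max(acc, key=acc.get)

# synthetic edge points on y = 0.5*(x - 3)**2 + 2, with analytic gradients
edges = [(x, 0.5 * (x - 3) ** 2 + 2, 1.0 * (x - 3)) for x in range(7)]
best = parabola_hough(edges, [0.25, 0.5, 1.0])
```

The gradient information is what collapses the vote from a 2D surface per point to a single cell per candidate a, which is the source of the accumulator-size reduction.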
As was noted early in the history of computer vision, using the same adjacency relation for the entire digital picture leads to so-called `paradoxes' related to the Jordan Curve Theorem. The most popular idea for avoiding these paradoxes in binary images was to use different adjacency relations for the foreground and the background: 8-adjacency for black points and 4-adjacency for white points, or vice versa. This idea cannot be extended in a straightforward way to multicolor pictures. In this paper a solution is presented which guarantees avoidance of the connectivity paradoxes related to the Jordan Curve Theorem for all multicolor pictures. Only one connectedness relation is used for the entire digital picture, i.e., for every component of every color. The idea is to disallow a certain `critical configuration,' which can be detected locally, from occurring in digital pictures; such pictures are called `well-composed.' Well-composed pictures have very nice topological properties. For example, the Jordan Curve Theorem holds and the Euler characteristic is locally computable. This implies that properties of algorithms used in computer vision can be stated and proved in a clear way, and that the algorithms themselves become simpler and faster. Moreover, if a digitization process is guaranteed to preserve topology, then the digital pictures it produces must be well-composed.
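For binary pictures the critical configuration reduces to a 2 x 2 checkerboard, which can be detected by a purely local scan; a minimal sketch (binary case only, the multicolor generalization is omitted):

```python
def is_well_composed(img):
    """A binary picture is well-composed iff no 2x2 block forms the
    'critical configuration' -- a checkerboard of two diagonal 1s and
    two diagonal 0s. `img` is a list of equal-length rows of 0/1."""
    for r in range(len(img) - 1):
        for c in range(len(img[0]) - 1):
            block = (img[r][c], img[r][c + 1],
                     img[r + 1][c], img[r + 1][c + 1])
            if block in ((1, 0, 0, 1), (0, 1, 1, 0)):
                return False
    return True
```

Locality of the test is what makes properties such as the Euler characteristic locally computable on well-composed pictures.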
Investigation of the topological nature of digitized knot structures is necessary for laying the foundations of a theory for dealing with the processing of such images. Images containing knots arise, for example, in the electron microscopy of DNA and RNA. The classical theory of knots is well developed, and one important result is that two knots are equivalent if their planar representations can be deformed into each other by a sequence of knot moves. We show here that digital knots, suitably defined, can be considered equivalent if they can be deformed into each other by a sequence of digital moves.
The straight-line Hough Transform using normal parameterization with a continuous voting kernel is considered. It transforms the collinearity detection problem into the problem of finding the global maximum of a two-dimensional function over a domain in the parameter space. The principle is similar to robust regression using fixed-scale M-estimation. Unlike standard M-estimation procedures, the Hough Transform does not rely on a good initial estimate of the line parameters: the global optimization problem is approached by exhaustive search on a grid that is usually as fine as computationally feasible. The global maximum of a general function over a bounded domain cannot be found by a finite number of function evaluations; only if sufficient a priori knowledge about the smoothness of the objective function is available can convergence to the global maximum be guaranteed. The extraction of a priori information and its efficient use are the main challenges in real global optimization problems. The global optimization problem in the Hough Transform is essentially how fine the parameter space quantization should be in order not to miss the true maximum. More than thirty years after Hough patented the basic algorithm, this problem is still essentially open. In this paper an attempt is made to identify a priori information on the smoothness of the objective (Hough) function and to introduce sufficient conditions for the convergence of the Hough Transform to the global maximum. An image model with several application-dependent parameters is defined. Edge point location errors as well as background noise are accounted for. Minimal parameter space quantization intervals that guarantee convergence are obtained. Focusing policies for multi-resolution Hough algorithms are developed. Theoretical support for bottom-up processing is provided. Due to the randomness of errors and noise, the convergence guarantees are probabilistic.
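A minimal sketch of the normal-parameterization voting scheme under discussion (the fixed unit-width rho bins and 1-degree theta bins are illustrative assumptions; choosing such quantization intervals safely is precisely the paper's subject):

```python
import numpy as np

def hough_lines(points, n_theta=180, rho_max=30):
    """Normal-parameterization Hough: each point (x, y) votes for all
    (theta, rho = x*cos(theta) + y*sin(theta)); rho is quantized into
    unit-width bins centered on the integers."""
    acc = np.zeros((n_theta, 2 * rho_max + 1), dtype=int)
    thetas = np.deg2rad(np.arange(n_theta))
    for x, y in points:
        rho = x * np.cos(thetas) + y * np.sin(thetas)
        acc[np.arange(n_theta), np.round(rho).astype(int) + rho_max] += 1
    return acc

points = [(i, i) for i in range(5)]   # the line y = x: theta = 135 deg, rho = 0
acc = hough_lines(points)
```

With coarse bins, nearby theta cells may collect the same votes; the paper's quantization analysis addresses exactly how fine the grid must be so the true maximum is not missed.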
A geometric model for object tracking with a camera mounted on a two-degrees-of-freedom (pan-tilt) robotic head is presented. Pan-tilt angles are computed from the projections of the moving target on the image plane. It is first shown that this is an open problem; an approximate geometric model is then proposed to solve it. A study of the approximation error and the results of an implementation on a commercial robotic head are also described.
A three-dimensional parallel thinning algorithm is presented. This algorithm works on the cubic grid with 26-connectivity. It is based upon two topological numbers introduced elsewhere; these numbers allow us to check whether a point is simple and to detect end points. The strategy used for removing points in parallel without altering the topology of the image is based upon subfields: the cubic grid is divided into 8 subfields which are successively activated. The use of 4 subfields is also considered. One major advantage of the subfield approach is that it is `complete,' i.e., all simple points which are not considered skeletal points are removed. The proposed algorithm can produce a curve skeleton, a surface skeleton, or a topological kernel of the objects. Furthermore, it can be implemented using only Boolean conditions.
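The 8-subfield decomposition by coordinate parity can be sketched directly; the check below verifies the property that makes parallel deletion safe, namely that no two distinct points of one subfield are ever 26-adjacent:

```python
from itertools import product

def subfield(x, y, z):
    """Index (0..7) of the subfield of a cubic-grid point, by coordinate
    parity. Distinct points of one subfield are never 26-adjacent, so all
    simple points of an active subfield can be deleted in parallel."""
    return (x % 2) * 4 + (y % 2) * 2 + (z % 2)

# verify the non-adjacency property on a small block of the grid
pts = list(product(range(4), repeat=3))
for p in pts:
    for q in pts:
        if p != q and subfield(*p) == subfield(*q):
            # same parity in every coordinate => every difference is even,
            # hence some coordinate differs by at least 2
            assert max(abs(a - b) for a, b in zip(p, q)) >= 2
```

Activating the 8 subfields in turn therefore removes points whose deletability was decided on a configuration no concurrent deletion can invalidate.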
The medial axis transform (MAT) of a shape, better known as its skeleton, is frequently used in shape analysis and related areas. In this paper a new approach for determining the skeleton of an object is presented. The boundary is segmented at points of maximal positive curvature and a distance map from each of the segments is calculated. The skeleton is then located by applying simple rules to the zero sets of differences of distance maps. A framework is proposed for numerical approximation of distance maps that is consistent with the continuous case and hence does not suffer from digitization bias due to metrication errors of the implementation on the grid. Subpixel accuracy in the distance map calculation is obtained by using gray-level information along the boundary of the shape in the numerical scheme. The accuracy of the resulting efficient skeletonization algorithm is demonstrated by several examples.
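A toy version of the distance-map-difference idea, using a coarse 4-neighbor BFS distance in place of the paper's consistent subpixel scheme (the shape, the two boundary segments, and the exact-equality zero-set test are all illustrative assumptions):

```python
from collections import deque

def bfs_distance(shape_pts, seeds):
    """4-neighbor BFS distance from a seed set, restricted to a set of
    grid points (a coarse stand-in for a consistent distance map)."""
    dist = {p: 0 for p in seeds}
    q = deque(seeds)
    while q:
        x, y = q.popleft()
        for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if n in shape_pts and n not in dist:
                dist[n] = dist[(x, y)] + 1
                q.append(n)
    return dist

# a 7x5 rectangle; boundary split into its left and right edges
shape = {(x, y) for x in range(7) for y in range(5)}
d_left = bfs_distance(shape, [(0, y) for y in range(5)])
d_right = bfs_distance(shape, [(6, y) for y in range(5)])
# skeleton candidates: zero set of the difference of the two distance maps
skeleton = {p for p in shape if d_left[p] == d_right[p]}
```

For this rectangle the zero set is the central column, the expected medial axis between the two opposite boundary segments.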
A new geometric formulation is given for the problem of determining position and orientation of a satellite scanner from error-prone ground control point observations in linear pushbroom imagery. The pushbroom satellite resection problem is significantly more complicated than that of the conventional frame camera because of irregular platform motion throughout the image capture period. Enough ephemeris data are typically available to reconstruct the satellite trajectory and hence the interior orientation of the pushbroom imagery. The new approach to resection relies on the use of the reconstructed scanner interior orientation to determine the relative orientations of a bundle of image rays. The absolute position and orientation which allow this bundle to minimize its distance from a corresponding set of ground control points may then be found. The interior orientation is represented as a kinematic chain of screw motions, implemented as dual-number quaternions. The motor algebra is used in the analysis since it provides a means of line, point, and motion manipulation. Its moment operator provides a metric of distance between an image ray and a ground control point.
Measuring similarity between polygonal shapes is an important problem in pattern recognition and machine intelligence. Often, recognition is done by extracting a set of features that represents an unknown object and comparing this set with sets of features extracted from objects of known identity (prototypes). The prototype that is closest to the unknown object, measured in a suitable metric, is assigned as the label for the unknown object. We consider the problem of comparing polygonal shapes based on their `annular profiles.' We show that the annular profile of a polygon Q of n sides induced by a given placement of an annular panel of size k can be computed in O(log k + W) time when Q is simple and in O(n log k + W) time when Q has holes. We show that the profile query problem (PQP) can be answered in O(log n) time, given O(n) space and O(n^2) pre-processing time. By relating the Voronoi diagrams of the edges and vertices of Q, we prove that all non-simple panels of Q can be computed in Omega(n^5) time. We conclude by discussing the problem of reconstructing a polygon from its annular profiles.
Given the number and variety of methods offered for handprinted character recognition, it may very well be that there is no single method that can be called the `best.' Most algorithms either use a collection of properties of the set of pixels belonging to the object, or attempt to measure structural properties of higher-level features (e.g., lines) which are composed of object pixels. Unfortunately, the degree of variation seen in handprinted characters makes feature properties hard to measure accurately and complicates the extraction of structural features. What is suggested here is the use of the background areas remaining after the character itself is removed.
Shape decomposition is mainly motivated by structural shape description methods. Given a complex shape, it is possible to decompose it into simpler sub-parts that are well described by scalar global features, and then use the sub-parts to compose a structural description of the shape. This paper presents a method that decomposes a polygonal approximation of the shape into nearly convex, possibly overlapping sub-parts by locating structures in a fuzzy relation matrix. The fuzzy relation used to construct the matrix is defined on the set of polygon vertices by a membership function that is maximal when the line connecting two vertices is contained completely within the polygon, and decreases as the deviation of this line from the polygon increases.
Finding the shortest path between points on a surface is a challenging global optimization problem: it is difficult to devise an algorithm that is computationally efficient, locally accurate, and guaranteed to converge to the globally shortest path. In this paper a two-stage, coarse-to-fine approach to finding shortest paths is suggested. In the first stage a fast algorithm is used to obtain an approximation to the globally shortest path; in the second stage the approximation is refined into a locally optimal path. The first stage uses the fast algorithm introduced by Kiryati and Szekely that combines a 3D length estimator with graph search. This path is then refined to a shorter geodesic curve by an algorithm that deforms an arbitrary initial curve ending at two given surface points via geodesic curvature shortening flow. The 3D curve shortening flow is transformed into an equivalent 2D flow that is implemented using an efficient numerical algorithm for curve evolution with fixed end points, introduced by Kimmel and Sapiro.
By considering all 2D transformations of image patterns, Otsu has shown that moment features constitute a kind of linear feature extractor and are closed under such transformations. However, projections from 3D space onto 2D space were not considered. In order to recognize 3D motions from 2D images, these projections should be taken into account. In this paper, therefore, the representation of the projected motion group is studied through Lie-algebraic methods. We give an explicit formula for the bases of its representation and show that the bases can be made orthonormal. Quasi-moments are defined and shown to be closed, order by order, under the projected rotation group.
The estimation of the length of a continuous three-dimensional curve from its digital image is considered. This requires a definition of the digitization process used for converting the continuous curve to the discrete representation. The two-dimensional case has been extensively studied in the literature. The few available estimators for the 3-D case are based on the 26-directional chain code representation of the digital curve; that representation provides a natural classification of the chain code links, which is necessary for accurate length estimation. Three-dimensional curve quantization methods are first considered. Desirable properties of curve representation schemes are identified and a quantitative comparison of the various methods is carried out. It is shown that grid-intersect quantization and other discretization schemes that lead to 26-directional chain code representations are inferior methods for curve quantization in 3-D, and that cube quantization, leading to 6-directional chain codes, should be preferred. Accurate length estimation based on cube quantization has not so far been attempted due to the lack of obvious link classification criteria. In this paper simple but powerful link classification criteria for 6-directional digital curves are suggested. They are used to obtain unbiased length estimators with rms errors as low as 0.57% for uniformly distributed straight lines, about five times better than previous estimators based on 26-directional chain code representations.
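For contrast, the classical 26-directional estimator that such work improves upon weights each link by its class; a sketch of that baseline (this is the known biased estimator, not the paper's unbiased 6-directional one, whose classification criteria and coefficients are not given in the abstract):

```python
import math

def chain_code_length_26(voxels):
    """Classical length estimate for a 26-connected digital curve: each
    link is weighted by its class -- axial (1), square-diagonal (sqrt 2),
    or cube-diagonal (sqrt 3). Known to be biased, which is part of what
    motivates cube-quantization-based estimators."""
    total = 0.0
    for p, q in zip(voxels, voxels[1:]):
        moves = sum(abs(a - b) for a, b in zip(p, q))  # 1, 2 or 3 axes change
        total += (1.0, math.sqrt(2.0), math.sqrt(3.0))[moves - 1]
    return total

curve = [(i, i, i) for i in range(5)]   # four cube-diagonal links
```

The link class here is simply the number of coordinates that change; the 6-directional setting removes the diagonal classes, which is why new classification criteria are needed there.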
Ridges are generalizations of local maxima for smooth functions of n independent variables. At a ridge point x the function has a local maximum when restricted to an affine (n - d)-dimensional plane located at x, the plane varying with x. The set of ridge points generically lies on d-dimensional manifolds. The ridge definition is extremely flexible, since the dimension d and the affine planes can be chosen to suit an application's needs. Fast algorithms for constructing 1-dimensional ridges in n-dimensional images are presented in this paper. The algorithms require an initial approximation to a ridge point, which can be supplied interactively or via a model of a previously analyzed image. Similar algorithms can be implemented for higher-dimensional ridges.
The use of correlation as a measure of similarity between two signals is a well-known technique. Correlation is commonly used in stereo vision to solve the correspondence problem: the aim is to find the position of a point in one of the two images of a weakly calibrated stereo pair which corresponds to a point in the other image. Template windows are chosen from the first image as 2D information samples to be matched in a region of interest of the second image. We propose a technique which exploits adaptive template matching. It works as an iterative optimization of the involved parameters with a consequent iterative refinement of the correspondence. Matching is performed by transforming a template window using the current image-to-image affine transformation and exploiting the epipolar geometry to reduce the degrees of freedom of the template position in the region of interest. As a validation test of the presented technique we report a study of the accuracy of the obtained results with various template sizes, and show that subpixel accuracy can be reached with relatively small templates. A major concern of our work is the accuracy of the correspondence: higher accuracy of the correspondences results in a more realistic reconstruction of the viewed scene. For a stereo pair having undergone an estimated distortion of 0.5 pixel (10 micrometers at the given pixel size), the technique achieves a correspondence accuracy of 3 micrometers for 17 x 17 template windows selected in textured image patches. Experiments are in progress to improve these results.
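A stripped-down sketch of correlation matching restricted to an epipolar line, assuming a rectified pair so the line is horizontal (the iterative affine refinement described above is omitted; image sizes and the template half-width are illustrative):

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-size patches."""
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def match_along_epipolar(left, right, r0, c0, half=3):
    """Best column for the template centered at (r0, c0) of `left`,
    searched along the same row of `right` (rectified-pair assumption,
    so the epipolar line is horizontal)."""
    tpl = left[r0 - half:r0 + half + 1, c0 - half:c0 + half + 1]
    cols = range(half, right.shape[1] - half)
    return max(cols, key=lambda c: ncc(
        tpl, right[r0 - half:r0 + half + 1, c - half:c + half + 1]))

rng = np.random.default_rng(0)
left = rng.random((20, 40))
right = np.roll(left, 5, axis=1)   # a pure 5-pixel horizontal disparity
best_c = match_along_epipolar(left, right, r0=8, c0=10)
```

Restricting the search to the epipolar line is what removes one degree of freedom from the template position; subpixel refinement would interpolate the correlation peak.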
Broadcasting with selective reduction (BSR) is a model of parallel computation introduced by Akl, Guenther, and Lindon. The model is more powerful than any of the PRAM models, yet one- and two-criteria BSR require no more hardware (up to a constant factor) to implement than a PRAM. Besides power, the model makes very short solutions to a variety of problems possible; for many problems BSR offers constant-time solutions that were not achievable on any PRAM. In this paper we apply BSR to solve, in constant time and with an optimal O(n) number of processors (where n is the input size), several visibility problems. The parallel and point visibility problems for a set of n parallel line segments, n iso-oriented rectangles, or n disks in the plane are solved on a two-criteria BSR. The problem of constructing the dominance graph of a set of n iso-oriented rectangles is solved using a three-criteria BSR.
This paper describes an efficient technique for representing 2D shapes based on sections of spiral arcs. The representation is invariant with respect to changes of position, orientation, and size. The shape may be considered at different scales of resolution. At coarser scales, minor detail in the contour is removed and the shape is simplified. Simplification can be achieved using rewrite rules which make the representation more compact, and hence suitable to use as a basis for matching and recognition.
Image representation schemes are of importance in the fields of computer vision, graphics, image processing, CAD/CAM, etc. Various representation schemes have been presented in the literature for both 2D and 3D. In this paper, we present representation schemes based on the concept of octagonal distances, called the medial circle representation (MCR) in 2-D and the medial sphere representation (MSR) in 3-D. Storage requirements, computational complexity, and the merits and demerits of the representation schemes are discussed.
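The abstract does not specify which octagonal distances are used; as an assumed illustration, the classical octagonal distance (shortest path alternating city-block and chessboard moves, starting with a city-block move) has a simple closed form:

```python
import math

def octagonal_distance(p, q):
    """Closed form of the classical octagonal distance on the 2D grid:
    max(chessboard distance, ceil(2/3 * city-block distance))."""
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    return max(max(dx, dy), math.ceil(2 * (dx + dy) / 3))
```

Octagonal distances approximate the Euclidean metric better than either the city-block or chessboard distance alone, which is what makes them attractive for medial circle/sphere representations.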
We consider the problem of identifying one of a set of polygonal models in the plane using point probes and finger probes. In particular, we give strategies for using a minimum number of finger probes to determine a finite number of possible locations of an unknown interior point p in one of the models. A finger probe takes as input an interior point p of a polygon P and a direction theta, and outputs the first point of intersection of the ray emanating from p in direction theta with the boundary of P. We show that without a priori knowledge of what the models look like, no finite number of finger probes will suffice to localize the point p. When the models are given in advance, we give both batch and dynamic probing strategies for solving the problem. We consider both the case where the models are aligned rectilinear polygons and the case where the models are simple polygons.
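A finger probe as defined above is straightforward to simulate by ray-casting against the polygon's edges; a minimal sketch (the eps tolerance is an implementation choice):

```python
import math

def finger_probe(polygon, p, theta, eps=1e-12):
    """First intersection of the ray from interior point p in direction
    theta with the boundary of `polygon` (a list of vertices)."""
    dx, dy = math.cos(theta), math.sin(theta)
    best_t, hit = float('inf'), None
    n = len(polygon)
    for i in range(n):
        (ax, ay), (bx, by) = polygon[i], polygon[(i + 1) % n]
        ex, ey = bx - ax, by - ay
        denom = dx * ey - dy * ex
        if abs(denom) < eps:               # ray parallel to this edge
            continue
        wx, wy = ax - p[0], ay - p[1]
        t = (wx * ey - wy * ex) / denom    # distance along the ray
        s = (wx * dy - wy * dx) / denom    # position along the edge
        if t > eps and -eps <= s <= 1 + eps and t < best_t:
            best_t, hit = t, (p[0] + t * dx, p[1] + t * dy)
    return hit

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
```

Keeping the nearest positive-t hit implements "first point of intersection"; for an interior starting point a hit always exists.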
The concept of a noisy straight line introduced by Melter and Rosenfeld is generalized and applied to digital parabolas. It is proved that digital parabola segments and their least-squares parabola fits are in one-to-one correspondence. This enables the first known vector-space representation of a digital parabola segment. One such representation is (x1, n, a, b, c), where x1 and n are the x-coordinate of the left endpoint and the number of digital points, respectively, while a, b, and c are the coefficients of the least-squares parabola fit Y = aX^2 + bX + c for the given parabola segment.
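The (x1, n, a, b, c) code can be computed directly from a digital point set with an ordinary least-squares fit; a sketch (np.polyfit stands in for the paper's exact one-to-one correspondence result):

```python
import numpy as np

def parabola_representation(points):
    """(x1, n, a, b, c) code of a digital parabola segment: left endpoint
    x-coordinate, point count, and least-squares fit Y = a*X^2 + b*X + c."""
    pts = sorted(points)
    xs = np.array([p[0] for p in pts], dtype=float)
    ys = np.array([p[1] for p in pts], dtype=float)
    a, b, c = np.polyfit(xs, ys, 2)        # least-squares quadratic fit
    return (pts[0][0], len(pts), a, b, c)

segment = [(x, x * x) for x in range(-2, 3)]   # lies exactly on Y = X^2
```

The one-to-one correspondence proved in the paper is what guarantees that this tuple loses no information about the digital segment.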
Consider a binary image pretiled, one pixel per processor, onto a mesh of size sqrt(n) x sqrt(n) enhanced with row and column buses. Our first contribution is to show an Omega(n^(1/6)) time lower bound for the problem of deciding whether the image contains at least one black pixel. We then obtain time lower bounds for many other digital geometry problems by reducing this fundamental problem to each of them. Specifically, the problems we address are: detecting whether an image contains at least one black pixel, computing the convex hull of the image, computing the diameter of an image, deciding whether a set of digital points is a digital line, computing the minimum distance between two images, deciding whether two images are linearly separable, and computing the perimeter, area, and width of a given image. Our second contribution is to show that these lower bounds are tight by exhibiting simple O(n^(1/6)) time algorithms for the same problems. An interesting feature of these algorithms is that they use, directly or indirectly, an algorithm for the leftmost-one problem recently developed by one of the authors.