In designing a computer vision system that analyzes two-dimensional images of real-world scenes, it is imperative to understand the principles underlying the image projections of upright objects (obstacles to be avoided or landmarks to be identified) and flat-lying objects (cast shadows, texture changes, etc.). This understanding is particularly important in the automated guidance of roving robots. To this end, this study begins by presenting a modular structure for the interpretation of real-world scenes and identifying the problems pertinent to the safe and enhanced guidance of roving robots. A mathematical framework is then developed to enhance scene interpretation; it comprises the derivation of the geometrical formulae relating the two-dimensional image to the three-dimensional real world and an analysis of the perspective effect present in the two-dimensional image. With this mathematical basis established, image techniques are developed to assess the image projections of the objects in question and to exploit the fact that upright objects, within the scope of the stated problem, are not affected by the perspective effect. Finally, to recover depth information, the proposed techniques are complemented by an algorithm designed to measure the disparity between stereo images.