For objects on a plane, a "scale factor" relates the physical dimensions of the objects to the corresponding dimensions in a camera image. This scale factor may be the only calibration parameter of importance in many test applications. The scale factor depends on the angular size of a pixel of the camera, and also on the range to the object plane. A measurement procedure is presented for the determination of scale factor to high precision, based on the translation of a large-area target by a precision translator. A correlation analysis of the images of a translated target against a reference image is used to extract image shifts and the scale factor. The precision of the measurement is limited by the translator accuracy, camera noise and various other secondary factors. This measurement depends on the target being translated in a plane perpendicular to the optic axis of the camera, so that the scale factor is constant during the translation. The method can be extended to inward-looking 3D camera networks and can, under suitable constraints, yield both scale factor and transcription angle.
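The correlation step described above can be illustrated with a minimal sketch. The abstract does not specify the correlation algorithm, so the FFT-based circular cross-correlation, the whole-pixel peak search, the synthetic random target, and all names here (`estimate_shift`, `fit_scale_factor`, the 0.1 mm/pixel value) are assumptions chosen for illustration. A high-precision measurement would presumably also need sub-pixel peak interpolation, which is omitted.

```python
import numpy as np


def estimate_shift(reference, shifted):
    """Return the (row, col) shift in whole pixels that maps reference onto shifted."""
    ref = reference - reference.mean()
    img = shifted - shifted.mean()
    # Circular cross-correlation computed in the Fourier domain.
    corr = np.fft.ifft2(np.fft.fft2(img) * np.conj(np.fft.fft2(ref))).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Peak indices past the midpoint correspond to negative shifts.
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))


def fit_scale_factor(translations_mm, shifts_px):
    """Least-squares slope of physical translation versus image shift (mm/pixel)."""
    slope, _intercept = np.polyfit(shifts_px, translations_mm, 1)
    return float(slope)


# Synthetic demonstration: a random large-area target translated along columns.
rng = np.random.default_rng(0)
target = rng.random((128, 128))
true_scale_mm_per_px = 0.1                      # assumed value for the demo
translations_mm = np.array([0.5, 1.0, 1.5, 2.0])
true_shifts_px = np.round(translations_mm / true_scale_mm_per_px).astype(int)

measured_shifts_px = []
for s in true_shifts_px:
    image = np.roll(target, int(s), axis=1)     # stands in for a translated-target image
    _row_shift, col_shift = estimate_shift(target, image)
    measured_shifts_px.append(col_shift)

scale = fit_scale_factor(translations_mm, measured_shifts_px)
print(f"estimated scale factor: {scale:.4f} mm/pixel")
```

Fitting a line through several translations, rather than dividing a single translation by a single shift, lets translator and camera noise average out. The fit assumes the scale factor is constant, which is why the target must move in a plane perpendicular to the optic axis.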
Visual information is of vital significance to both animals and artificial systems. The majority of mammals rely on two images, each with a resolution of 10<sup>7</sup>-10<sup>8</sup> 'pixels' per image. At the other extreme are insect eyes where the field of view is segmented into 10<sup>3</sup>-10<sup>5</sup> images, each comprising effectively one pixel/image. The great majority of artificial imaging systems lie nearer to the mammalian characteristics in this parameter space, although electronic compound eyes have been developed in this laboratory and elsewhere. If the definition of a vision system is expanded to include networks or swarms of sensor elements, then schools of fish, flocks of birds and ant or termite colonies occupy a region where the number of images and the pixels/image may be comparable. A useful system might then have 10<sup>5</sup> imagers, each with about 10<sup>4</sup>-10<sup>5</sup> pixels. Artificial analogs to these situations include sensor webs, smart dust and co-ordinated robot clusters. As an extreme example, we might consider the collective vision system represented by the imminent existence of ~10<sup>9</sup> cellular telephones, each with a one-megapixel camera. Unoccupied regions in this resolution-segmentation parameter space suggest opportunities for innovative artificial sensor network systems. Essential for the full exploitation of these opportunities is the availability of custom CMOS image sensor chips whose characteristics can be tailored to the application. Key attributes of such a chip set might include integrated image processing and control, low cost, and low power. This paper compares selected experimentally determined system specifications for an inward-looking array of 12 cameras with the aid of a camera-network model developed to explore the tradeoff between camera resolution and the number of cameras.
When a bright light source is viewed through Night Vision Goggles (NVG), the image of the source can appear enveloped in a “halo” that is much larger than the “weak-signal” point spread function of the NVG. The halo phenomenon was investigated in order to produce an accurate model of NVG performance for use in psychophysical experiments. Halos were created and measured under controlled laboratory conditions using representative Generation III NVGs. To quantitatively measure halo characteristics, the NVG eyepiece was replaced by a CMOS imager. Halo size and intensity were determined from camera images as functions of point-source intensity and ambient scene illumination. Halo images were captured over a wide range of source radiances (7 orders of magnitude) and then processed with standard analysis tools to yield spot characteristics. The spot characteristics were analyzed to verify our proposed parametric model of NVG halo event formation. The model considered the potential effects of many subsystems of the NVG in the generation of the halo: objective lens, photocathode, image intensifier, fluorescent screen and image guide. A description of the halo effects and the model parameters is contained in this work, along with a qualitative rationale for some of the parameter choices.
The capture of a wide field of view (FOV) scene by dividing it into multiple sub-images is a technique with many precedents in the natural world, the most familiar being the compound eyes of insects and other arthropods. Artificial structures of networked cameras and simple compound eyes have been constructed for applications in robotics and machine vision. Previous work in this laboratory has explored the construction and calibration of sensors which produce multiple small images (~150 pixels in diameter) for high-speed object tracking.
In this paper, design options are presented for electronic compound eyes consisting of 10<sup>1</sup>-10<sup>3</sup> identical 'eyelets'. To implement a compound eye, multiple sub-images can be captured by distributing cameras and/or image collection optics. Figures of merit for comparisons will be developed to illustrate the impact of design choices on the field of view, resolution, information rate, image processing, calibration, environmental sensitivity and compatibility with integrated CMOS imagers.
Whereas compound eyes in nature are outward-looking, the methodology and subsystems for an outward-looking compound-eye sensor are similar to those for an inward-looking sensor, although inward-looking sensors have a common region viewable by all eyelets simultaneously. The paper addresses the design considerations for compound eyes in both outward-looking and inward-looking configurations.