Images recorded by digital cameras are invariably distorted by CCD sensor errors and by a series of camera operations in the imaging process. The distortion sources include noise, geometric distortion, gamma correction, intensity and chromatic bias, and blurring. However, many situations require the true signal of the incident light. In addition to traditional visual computing tasks such as shape from shading, color constancy, and photometric stereo, acquiring a large natural image database in which each image has a high dynamic range and is carefully calibrated to reflect the true signal of the incident light is essential to human vision research. We recently acquired such a database of 1600 images using an Olympus C2040 digital camera and explained a series of color perception phenomena based on the statistics of these images. In this paper we present the techniques used to calibrate and acquire this database, including methods to correct spatial falloff, non-linearity, spectral bias, blurring, and noise, and to obtain a high dynamic range for each image. The techniques presented here can be used to acquire similar databases for a wide range of human vision and computer vision research fields.
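The high-dynamic-range step described above can be sketched as merging several linearized exposures (i.e., after non-linearity/gamma correction) into a single radiance map. The hat-shaped weighting and the function name `merge_hdr` below are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def merge_hdr(images, exposure_times):
    """Merge linearized exposures into one HDR radiance map.

    images: list of float arrays in [0, 1], already corrected for
    camera non-linearity; exposure_times: exposure durations (seconds).
    """
    acc = np.zeros_like(images[0], dtype=np.float64)
    wsum = np.zeros_like(acc)
    for img, t in zip(images, exposure_times):
        # Hat weighting: trust mid-range pixels, down-weight
        # near-saturated and near-dark ones.
        w = 1.0 - np.abs(2.0 * img - 1.0)
        acc += w * (img / t)       # each exposure estimates radiance as img / t
        wsum += w
    return acc / np.maximum(wsum, 1e-8)
```

With a constant-radiance scene photographed at 1 s and 2 s, the merged map recovers the common radiance estimate from both exposures.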
Image classification can facilitate semantic retrieval and browsing of large-scale image databases. Existing approaches are usually based on extracting local or global low-level features, such as color, edge, and texture, from images. In this paper, we propose an image categorization method that characterizes the scene structure of each image. The 2D spatial frequency map of an image, together with its projection-vector and principal-component representations, is used to characterize the image's spatial structure. Based on multiple similarity scores, we generate image categories using a spectral clustering method and a combined maximal-spanning-tree spectral-clustering method.
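A minimal sketch of the representation named above, assuming the spatial frequency map is the centered log-magnitude of the 2D FFT and the projection vectors are its row and column marginals (both plausible readings, not confirmed by the abstract):

```python
import numpy as np

def spatial_frequency_map(image):
    """Centered log-magnitude 2D spatial frequency map of a grayscale image."""
    f = np.fft.fftshift(np.fft.fft2(image))  # move DC to the center
    return np.log1p(np.abs(f))               # compress the dynamic range

def projection_vectors(sf_map):
    """Row and column marginal projections of the frequency map."""
    return sf_map.mean(axis=1), sf_map.mean(axis=0)
```

For a constant image, all spectral energy sits at the DC bin, which after `fftshift` lies at the center of the map; the projection vectors then give a compact 1D summary usable for similarity scoring.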
Automatic understanding of document images is a hard problem. Here we consider a sub-problem: automatically extracting the filled-in content from form images. We propose a novel approach, based on clustering component-block projection vectors, that requires neither pre-selected templates nor sophisticated structural/semantic analysis. By combining spectral clustering and minimal spanning tree clustering, we generate highly accurate clusters, from which adaptive templates are constructed to extract the filled-in content. Our experiments show that this approach is effective on a set of 1040 US IRS tax form images belonging to 208 types.
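The minimal-spanning-tree half of the clustering step can be sketched as building an MST over pairwise distances between projection vectors and cutting its longest edges (equivalent to single-linkage clustering); the spectral half and the exact distance measure are omitted, and `mst_clusters` is a hypothetical name:

```python
import numpy as np

def mst_clusters(vectors, k):
    """Cut the k-1 longest MST edges over Euclidean distances; return labels."""
    X = np.asarray(vectors, dtype=float)
    n = len(X)
    d = np.linalg.norm(X[:, None] - X[None, :], axis=2)  # pairwise distances
    # Prim's algorithm for the minimum spanning tree.
    visited = np.zeros(n, dtype=bool)
    visited[0] = True
    best = d[0].copy()
    parent = np.zeros(n, dtype=int)
    edges = []  # (weight, u, v)
    for _ in range(n - 1):
        j = int(np.argmin(np.where(visited, np.inf, best)))
        edges.append((best[j], parent[j], j))
        visited[j] = True
        closer = d[j] < best
        best = np.where(closer, d[j], best)
        parent = np.where(closer, j, parent)
    # Keep the n-k shortest MST edges; connected components become clusters.
    edges.sort()
    root = list(range(n))
    def find(a):
        while root[a] != a:
            root[a] = root[root[a]]
            a = root[a]
        return a
    for _, u, v in edges[: n - k]:
        root[find(u)] = find(v)
    labels = [find(i) for i in range(n)]
    uniq = {r: i for i, r in enumerate(dict.fromkeys(labels))}
    return [uniq[r] for r in labels]
```

Two tight groups of 1D vectors separated by a large gap are split exactly at the gap when `k=2`.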