Natural scene classification is a challenging open problem in computer vision. We present a novel spatial pyramid representation scheme for recognizing scene categories. First, each image is partitioned into sub-blocks using superpixel lattice segmentation guided by a boosted edge learning boundary map, which preserves the integrity of the objects in each sub-block; that is, the features within each sub-block are relatively consistent. Then, we extract dense scale-invariant feature transform features from the images and form a contextual visual feature description. Finally, the image representation is built following the spatial pyramid methodology. The resulting feature descriptions capture both local structural information and global spatial structural information and are therefore more discriminative for scene classification. Experiments show that the classification rate reaches about 87.13% on a set of 15 categories of complex scenes.
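The pooling step of a spatial pyramid can be illustrated with a minimal sketch. It assumes the dense local features have already been quantized into integer visual-word indices, and it uses a regular grid at each level rather than the superpixel lattice sub-blocks described above, so it shows the generic pyramid construction, not this paper's exact scheme. The function name, the normalized coordinates, and the commonly used pyramid-match level weights are assumptions made for illustration.

```python
import numpy as np

def spatial_pyramid_histogram(xy, words, vocab_size, levels=2):
    """Concatenate per-cell visual-word histograms over a grid pyramid.

    xy     : (N, 2) feature positions normalized to [0, 1] (assumed input).
    words  : (N,) integer visual-word indices in [0, vocab_size).
    levels : finest level L; level l uses a 2**l x 2**l regular grid.
    """
    xy = np.asarray(xy, dtype=float)
    words = np.asarray(words, dtype=int)
    parts = []
    for level in range(levels + 1):
        cells = 2 ** level
        # Map each feature to its grid cell at this level.
        col = np.minimum((xy[:, 0] * cells).astype(int), cells - 1)
        row = np.minimum((xy[:, 1] * cells).astype(int), cells - 1)
        cell_id = row * cells + col
        # Joint histogram over (cell, visual word), stored flat.
        flat = cell_id * vocab_size + words
        hist = np.bincount(flat, minlength=cells * cells * vocab_size).astype(float)
        # Commonly used pyramid-match weights: coarser levels count less.
        if level == 0:
            weight = 1.0 / 2 ** levels
        else:
            weight = 1.0 / 2 ** (levels - level + 1)
        parts.append(weight * hist)
    vec = np.concatenate(parts)
    total = vec.sum()
    # Normalize so images with different feature counts are comparable.
    return vec / total if total > 0 else vec
```

With a vocabulary of K words and levels=2, the resulting vector has K * (1 + 4 + 16) = 21K entries, which is the form typically passed to a classifier such as an SVM.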
Matching points between two or more images of a scene is a vital component of many computer vision and pattern recognition tasks. The key step in point matching is constructing a distinctive and robust descriptor. The state-of-the-art scale-invariant feature transform (SIFT) descriptor has been shown to outperform other local descriptors in distinctiveness and robustness. However, the SIFT descriptor neglects the global context of the feature points, and thus it fails to resolve the ambiguities that arise in locally similar regions of an image. In this paper, a spatial distribution (SD) descriptor is constructed for each feature point detected by the SIFT method. It uses a log-polar histogram to build the global component from the difference-of-Gaussian convolution image information. Because it integrates the local and global information of feature points, the SD descriptor is invariant to rotation and zoom and partially invariant to skew. Point matching is performed on various images with the proposed framework. Experimental results show that the SD method outperforms the method using only SIFT.
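A log-polar histogram around a feature point can be sketched as follows. The abstract does not give the bin layout or say how the difference-of-Gaussian values enter the histogram, so everything here is an assumption for illustration: the function name, the bin counts, the use of per-sample weights as a stand-in for difference-of-Gaussian responses, and the choice to measure angles relative to the keypoint orientation to obtain rotation invariance. Choosing r_max proportional to the keypoint scale would be one hypothetical way to obtain invariance to zoom.

```python
import numpy as np

def log_polar_histogram(center, angle, points, weights, r_max, n_r=5, n_theta=12):
    """Weighted log-polar histogram of `points` around `center`.

    center  : (x, y) of the feature point.
    angle   : keypoint orientation in radians (assumed to be available).
    points  : (N, 2) sample positions.
    weights : (N,) non-negative values (hypothetically, absolute
              difference-of-Gaussian responses at those positions).
    r_max   : outer radius; samples farther away are ignored.
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    d = points - np.asarray(center, dtype=float)
    r = np.hypot(d[:, 0], d[:, 1])
    keep = (r > 0) & (r <= r_max)
    d, r, w = d[keep], r[keep], weights[keep]
    # Radial bins double in width toward r_max (uniform in log2 of the radius).
    r_bin = np.clip(np.floor(np.log2(r / r_max)).astype(int) + n_r, 0, n_r - 1)
    # Angles are measured relative to the keypoint orientation.
    theta = np.mod(np.arctan2(d[:, 1], d[:, 0]) - angle, 2.0 * np.pi)
    t_bin = np.floor(theta / (2.0 * np.pi) * n_theta).astype(int) % n_theta
    hist = np.zeros((n_r, n_theta))
    # Accumulate weights; add.at handles repeated (r_bin, t_bin) pairs.
    np.add.at(hist, (r_bin, t_bin), w)
    total = hist.sum()
    return hist / total if total > 0 else hist
```

In a combined descriptor, a flattened histogram of this kind could be concatenated with the local SIFT vector, with the relative weighting of the two parts left as a tunable choice.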