Perceptual features, for example direction, contrast and repetitiveness, are important visual factors for human to perceive a texture. However, it needs to perform psychophysical experiment to quantify these perceptual features’ scale, which requires a large amount of human labor and time. This paper focuses on the task of obtaining perceptual features’ scale of textures by small number of textures with perceptual scales through a rating psychophysical experiment (what we call labeled textures) and a mass of unlabeled textures. This is the scenario that the semi-supervised learning is naturally suitable for. This is meaningful for texture perception research, and really helpful for the perceptual texture database expansion. A graph-based semi-supervised learning method called random multi-graphs, RMG for short, is proposed to deal with this task. We evaluate different kinds of features including LBP, Gabor, and a kind of unsupervised deep features extracted by a PCA-based deep network. The experimental results show that our method can achieve satisfactory effects no matter what kind of texture features are used.
Surface height map estimation is an important task in high-resolution 3D reconstruction. This task differs from general scene depth estimation in the fact that surface height maps contain more high frequency information or fine details. Existing methods based on radar or other equipments can be used for large-scale scene depth recovery, but might fail in small-scale surface height map estimation. Although some methods are available for surface height reconstruction based on multiple images, e.g. photometric stereo, height map estimation directly from a single image is still a challenging issue. In this paper, we present a novel method based on convolutional neural networks (CNNs) for estimating the height map from a single image, without any equipments or extra prior knowledge of the image contents. Experimental results based on procedural and real texture datasets show the proposed algorithm is effective and reliable.
Since handwritten text lines are generally skewed and not obviously separated, text line segmentation of handwritten document images is still a challenging problem. In this paper, we propose a novel text line segmentation algorithm based on the spectral clustering. Given a handwritten document image, we convert it to a binary image first, and then compute the adjacent matrix of the pixel points. We apply spectral clustering on this similarity metric and use the orthogonal kmeans clustering algorithm to group the text lines. Experiments on Chinese handwritten documents database (HIT-MW) demonstrate the effectiveness of the proposed method.
In this paper, we propose two new problems related to classification of photographed document images, and based on deep learning methods, present the baseline solutions for these two problems. The first problem is that, for some photographed document images, which book do they belong to? The second one is, for some photographed document images, what is the type of the book they belong to? To address these two problems, we apply “AexNet” to the collected document images. Using the pre-trained “AlexNet” on the ImageNet data set directly, we obtain 92.57% accuracy for the book-name classification and 93.33% accuracy for the book-type one. After fine-tuning on the training set of the photographed document images, the accuracy of the book-name classification increases to 95.54% and that of the booktype one to 95.42%. To our best knowledge, although there exist many image classification algorithm, no previous work has targeted to these two challenging problems. In addition, the experiments demonstrate that deep-learning features outperform features extracted with traditional image descriptors on these two problems.