In the past few years, Generative Adversarial Networks (GANs) have become a prevalent research topic. A GAN can generate good-quality images that look like natural images from a random vector. In this paper, we follow the basic idea of the GAN and propose a novel model for image saliency detection, called Supervised Adversarial Networks (SAN). Unlike a standard GAN, the proposed method trains both the G-Network and the D-Network with fully supervised learning by exploiting the class labels of the training set. Moreover, a novel kind of layer, called the conv-comparison layer, is introduced into the D-Network to further improve saliency performance. Experimental results on the Pascal VOC 2012 dataset show that the SAN model can generate high-quality saliency maps for many complicated natural images.
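The abstract does not define the conv-comparison layer, but one plausible reading is that the discriminator convolves the predicted and ground-truth saliency maps with shared weights and compares the feature responses, rather than looking at each map in isolation. A minimal numpy sketch of that reading (the layer's exact form, and all names here, are our assumptions, not the paper's specification):

```python
import numpy as np

def conv2d(img, kernel):
    """Plain valid-mode 2-D convolution, single channel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def conv_comparison(pred_map, gt_map, kernel):
    """Hypothetical conv-comparison layer: convolve both saliency
    maps with shared weights and compare the responses elementwise,
    so downstream D-Network layers see where prediction and ground
    truth agree (our interpretation of the abstract)."""
    f_pred = conv2d(pred_map, kernel)
    f_gt = conv2d(gt_map, kernel)
    return f_pred * f_gt  # large where both maps respond strongly

kernel = np.ones((3, 3)) / 9.0
gt = np.zeros((8, 8)); gt[2:6, 2:6] = 1.0     # toy ground-truth saliency
good = gt.copy()                               # accurate prediction
bad = np.zeros((8, 8)); bad[0:2, 0:2] = 1.0    # prediction in the wrong place
# agreement score is higher for the accurate prediction
print(conv_comparison(good, gt, kernel).sum() > conv_comparison(bad, gt, kernel).sum())
```

Under this reading, an accurate saliency map yields a strong agreement signal, which gives the discriminator a direct cue for judging prediction quality.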
Image set annotation is an important task in the supervised training of deep neural networks. Manual and data-driven annotation are the commonly used approaches, and both have shortcomings, especially when the dataset requires professional knowledge: manual annotation incurs high cost, while data-driven annotation yields poorly diversified samples. Although recommendation-based annotation using cosine similarity over deep neural network features combines the advantages of the manual and data-driven methods, it still suffers from low accuracy and a low click-through rate. To improve recommendation accuracy and click-through rate, we propose a confusion-graph recommendation annotation method, which builds a confusion graph based on Large Margin Nearest Neighbor (LMNN) distances among deep neural network features to recommend the most confusing images to annotators. In this paper, we conduct ablation studies on a self-built child face dataset in terms of Precision, mAP (mean Average Precision), and CTR (click-through rate). The experimental results show that the proposed method achieves superior performance compared with the cosine similarity recommendation annotation method and the manual annotation method.
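The confusion-graph idea can be sketched as: connect samples of different classes whose features lie close together under the metric, then recommend the samples with the strongest cross-class edges. A minimal numpy sketch, substituting plain Euclidean distance for the learned LMNN metric (all names and the exact edge weighting are our assumptions):

```python
import numpy as np

def confusion_graph_recommend(feats, labels, top_k=2):
    """Toy confusion-graph recommendation: for each sample, its
    confusion score is the weight of its strongest edge to a
    sample of a DIFFERENT class (closer neighbour => heavier edge).
    The most confusing samples are recommended for annotation.
    Euclidean distance stands in for the learned LMNN metric."""
    n = len(feats)
    confusion = np.zeros(n)
    for i in range(n):
        for j in range(n):
            if labels[i] != labels[j]:
                d = np.linalg.norm(feats[i] - feats[j])
                confusion[i] = max(confusion[i], 1.0 / (1e-6 + d))
    return np.argsort(-confusion)[:top_k]

feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [0.2, 0.1]])
labels = [0, 0, 1, 1]
# sample 3 (class 1) sits among class-0 samples, so it and its
# nearest cross-class neighbour (sample 1) are recommended first
print(confusion_graph_recommend(feats, labels))
```

In a real pipeline, `feats` would be deep-network features and the distance would come from the trained LMNN metric; the graph structure is what lets the method surface boundary cases instead of random or purely popular samples.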
Image annotation is the task of assigning semantic labels to an image. Recently, deep neural networks with visual attention have been applied successfully to many computer vision tasks. In this paper, we show that the conventional attention mechanism is easily misled by the salient class, i.e., the attended region always contains part of the image area describing the content of the salient class at different attention iterations. To address this, we propose a novel attention shaping mechanism, which aims to maximize the non-overlapping area between consecutive attention steps by taking into account the history of previous attention vectors. Several weighting policies are studied to utilize the history information in different manners. On two benchmark datasets, PASCAL VOC2012 and MIRFlickr-25k, average precision is improved by up to 10% in comparison with state-of-the-art annotation methods.
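One of the weighting policies over the attention history can be sketched as an exponentially decaying penalty: regions covered by earlier attention maps are down-weighted before the softmax, pushing each new step toward non-overlapping areas. A minimal numpy sketch (this particular policy and all names are our assumptions, not the paper's exact formulation):

```python
import numpy as np

def shaped_attention(logits, history, decay=0.5):
    """Hypothetical attention-shaping step: subtract a weighted sum
    of previous attention maps from the raw logits, with weights
    decaying exponentially into the past, then renormalize with a
    softmax over spatial positions."""
    penalty = np.zeros_like(logits)
    for t, past in enumerate(reversed(history)):
        penalty += (decay ** t) * past
    shaped = logits - penalty
    e = np.exp(shaped - shaped.max())  # numerically stable softmax
    return e / e.sum()

logits = np.array([3.0, 1.0, 0.5, 0.2])  # toy spatial attention logits
first = shaped_attention(logits, history=[])
second = shaped_attention(logits, history=[first * 10])
# the second step shifts attention away from the region the first step covered
print(first.argmax(), second.argmax())
```

Other policies the abstract alludes to could, for example, weight all history entries uniformly or keep only the most recent map; the common thread is penalizing overlap so the salient class cannot dominate every iteration.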