Cancer recognition is a prerequisite for determining appropriate treatment. This paper focuses on the semantic segmentation of microvascular morphological types in narrow-band images to aid the clinical examination of esophageal cancer. The main challenge for semantic segmentation is incomplete labeling. Our key insight is to build fully convolutional networks (FCNs) with double labels to make pixel-wise predictions. The roi-label, which indicates regions of interest (ROIs), is introduced as an extra constraint to guide feature learning. Trained end-to-end, the FCN model with two targets jointly optimizes the segmentation of the sem-label (semantic label) and the segmentation of the roi-label within a self-transfer learning framework based on multi-task learning theory. With the support of the roi-label, the shared convolutional networks learn a better representation for the sem-label by gaining a better understanding of the information outside the ROIs. Our best FCN model gives satisfactory segmentation results, with mean IU up to 77.8% (pixel accuracy > 90%). The results show that the proposed approach is able to assist clinical diagnosis to a certain extent.
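The joint optimization of the two targets can be illustrated with a small sketch. The abstract does not give the loss, so everything here is an assumption made for illustration: a pixel-wise cross-entropy per target, a hypothetical `roi_weight` that balances the two terms, and a hypothetical `ignore_index` that skips unlabeled pixels as one plausible way to handle incomplete labeling.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last (class) axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pixel_cross_entropy(logits, labels, ignore_index=-1):
    """Mean cross-entropy over labeled pixels.

    logits: (H, W, C) class scores; labels: (H, W) integer class ids.
    Pixels equal to ignore_index (an assumption, not from the paper)
    contribute nothing to the loss.
    """
    probs = softmax(logits)
    valid = labels != ignore_index
    if not valid.any():
        return 0.0
    rows = probs[valid]                      # (K, C) labeled pixels only
    picked = rows[np.arange(len(rows)), labels[valid]]
    return float(-np.log(picked + 1e-12).mean())

def joint_loss(sem_logits, sem_labels, roi_logits, roi_labels, roi_weight=1.0):
    """Sum of the sem-label loss and the weighted roi-label loss.

    Both heads are assumed to sit on top of shared convolutional
    features, so minimizing this sum updates the shared layers
    with gradients from both targets.
    """
    sem = pixel_cross_entropy(sem_logits, sem_labels)
    roi = pixel_cross_entropy(roi_logits, roi_labels)
    return sem + roi_weight * roi
```

In a real training loop the two logit maps would come from two output branches of the same network; the sketch only shows how the two per-pixel objectives combine into one scalar.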
A CAPTCHA (“Completely Automated Public Turing test to tell Computers and Humans Apart”) system is a program that most humans can pass but current computer programs can hardly pass. As the most common type of CAPTCHA, text-based CAPTCHAs have been widely used on websites to defend against network bots. To break text-based CAPTCHAs, this paper connects two trained CNN models for the segmentation and classification of CAPTCHA images. Then, based on these two models, we apply sliding-window segmentation and voting classification to realize an end-to-end CAPTCHA-breaking system with a high success rate. The experimental results show that our method is robust and effective in breaking text-based CAPTCHAs with noise.
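The combination of sliding-window segmentation and voting classification can be sketched as follows. The abstract does not specify how windows are grouped or how votes are counted, so this is only one plausible reading: windows are assigned to the character segment that contains their centre, and each segment takes the majority label. The per-window labels would come from the classification CNN and the segment boundaries from the segmentation CNN; both are passed in here as plain lists, and all function names are hypothetical.

```python
from collections import Counter

def sliding_windows(width, window, stride):
    """Return (start, end) column ranges of all windows that fit in the image."""
    return [(s, s + window) for s in range(0, width - window + 1, stride)]

def majority_vote(labels):
    """Most frequent label; ties go to the label encountered first."""
    return Counter(labels).most_common(1)[0][0]

def decode(windows, window_labels, segments):
    """Vote per character segment.

    windows:       list of (start, end) column ranges
    window_labels: one predicted character per window
    segments:      list of (start, end) column ranges, one per character
    """
    text = []
    for seg_start, seg_end in segments:
        votes = [label
                 for (start, end), label in zip(windows, window_labels)
                 if seg_start <= (start + end) / 2 < seg_end]
        if votes:
            text.append(majority_vote(votes))
    return "".join(text)
```

Voting over several overlapping windows means that a single misclassified window, for example one dominated by noise, does not change the decoded character.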
We propose a novel multiscale sparse representation approach for SAR target classification. It first extracts dense SIFT descriptors at multiple scales, then trains a global multiscale dictionary with a sparse coding algorithm. After obtaining the sparse representation, the method applies spatial pyramid matching (SPM) and max pooling to summarize the features of each image. The proposed method provides more information and descriptive ability than single-scale methods. Moreover, it incurs less extra computation than existing multiscale methods, which compute a dictionary for each scale. The MSTAR database and a ship database collected from TerraSAR-X images are used in the classification experiments. Results show that the best overall classification rate of the proposed approach reaches 98.83% on the MSTAR database and 92.67% on the TerraSAR-X ship database.
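The pooling step can be illustrated with a minimal sketch of max pooling over a spatial pyramid. It assumes the sparse codes have already been computed (one row per descriptor) and that descriptor positions are normalized to [0, 1); the grid sizes in `levels` and the use of absolute values are common choices, not details taken from the paper.

```python
import numpy as np

def spm_max_pool(codes, xy, levels=(1, 2, 4)):
    """Max-pool sparse codes over a spatial pyramid.

    codes:  (N, K) sparse codes, one row per local descriptor
    xy:     (N, 2) descriptor positions, normalized to [0, 1)
    levels: grid sizes; g means a g x g partition of the image
    Returns a vector of length K * sum(g * g for g in levels).
    """
    codes = np.abs(np.asarray(codes, dtype=float))
    xy = np.asarray(xy, dtype=float)
    k = codes.shape[1]
    pooled = []
    for g in levels:
        # Grid cell of every descriptor at this pyramid level.
        cell = np.minimum((xy * g).astype(int), g - 1)
        cell_id = cell[:, 1] * g + cell[:, 0]
        for c in range(g * g):
            members = codes[cell_id == c]
            # Empty cells contribute a zero vector.
            pooled.append(members.max(axis=0) if len(members) else np.zeros(k))
    return np.concatenate(pooled)
```

Because descriptors from all scales share one dictionary, they can be pooled together into a single image-level vector of fixed length, which is then passed to the classifier.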
Image completion addresses the problem of filling a missing region using information from the same or another image. Existing algorithms struggle to balance visual plausibility and efficiency. In this paper, we first propose a novel graph-based approach that combines patch offsets and structure features to obtain more coherent completion results. We further propose using a few dominant offsets with an adaptive mechanism for the labels, and formulate image completion as a graph-cut optimization problem. Experiments on a wide variety of images show that our method yields better results than state-of-the-art methods in various challenging cases, in terms of both visual quality and efficiency.
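The idea of restricting the labels to a few dominant offsets can be sketched as a simple frequency count. The abstract does not say how patch offsets are matched or how the adaptive mechanism chooses the number of labels, so the sketch assumes the offsets are already given as integer (dx, dy) pairs and that the number of labels `k` is supplied by the caller; `dominant_offsets` is a hypothetical helper name.

```python
from collections import Counter

def dominant_offsets(offsets, k):
    """Return the k most frequent non-zero patch offsets.

    offsets: iterable of (dx, dy) integer pairs, one per matched patch
    k:       number of offsets to keep as candidate labels
    The zero offset is skipped because it maps a patch onto itself.
    """
    counts = Counter(tuple(o) for o in offsets if tuple(o) != (0, 0))
    return [offset for offset, _ in counts.most_common(k)]
```

Each retained offset then serves as one label in the graph-cut problem: assigning label (dx, dy) to a missing pixel means copying the pixel found at that displacement, so a small label set keeps the optimization cheap.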
The application of image morphing to computer animation and computer graphics is experiencing broad growth. It has proven to be a powerful visual effects tool in film and television, depicting the fluid transformation of one digital image into another. In this paper, we present a new method for image morphing based on field morphing and mesh warping. Our method combines the feature specification method of field morphing, which is simple and expressive, with the warp generation approach of mesh warping, which is straightforward and fast. Some measures are taken to make the proposed method work; experimental results show that the proposed method simplifies the input of features, computes quickly, and produces a stable metamorphosis effect.
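For readers unfamiliar with field morphing, the sketch below shows the standard single line-pair mapping on which its feature specification is based: a point is expressed by its position along a feature line (u) and its signed distance from it (v), and the same (u, v) is then evaluated relative to the corresponding line in the other image. This illustrates the classical formulation only, not the combined method of the paper, and the function names are chosen for illustration.

```python
import numpy as np

def perpendicular(v):
    """Rotate a 2-D vector by 90 degrees counter-clockwise."""
    return np.array([-v[1], v[0]], dtype=float)

def map_point(x, p_dst, q_dst, p_src, q_src):
    """Map a destination-image point to the source image for one line pair.

    (p_dst, q_dst): end points of the feature line in the destination image
    (p_src, q_src): end points of the matching line in the source image
    """
    x, p_dst, q_dst = (np.asarray(a, dtype=float) for a in (x, p_dst, q_dst))
    p_src, q_src = (np.asarray(a, dtype=float) for a in (p_src, q_src))
    d_dst = q_dst - p_dst
    d_src = q_src - p_src
    # u: fraction of the way along the line; v: signed distance from it.
    u = np.dot(x - p_dst, d_dst) / np.dot(d_dst, d_dst)
    v = np.dot(x - p_dst, perpendicular(d_dst)) / np.linalg.norm(d_dst)
    return p_src + u * d_src + v * perpendicular(d_src) / np.linalg.norm(d_src)
```

Specifying a feature therefore only requires drawing a pair of corresponding line segments, which is what makes this style of input simple and expressive.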
Decorrelation is the most important step in lossless compression. Multispectral images contain both spatial and spectral redundancies. In this paper, we propose a new technique in which an integer wavelet transform is used to remove spatial redundancy and non-linear predictors are used to remove spectral redundancy. For computational reasons, the number of spectral predictors is discussed. These techniques result in higher compression ratios.
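The property that makes an integer wavelet transform suitable for lossless compression is that it maps integers to integers and is exactly invertible. The abstract does not name the wavelet used, so the sketch below uses the S-transform (an integer Haar transform implemented by lifting) purely as a simple example of the idea; it assumes a one-dimensional signal of even length.

```python
import numpy as np

def s_transform_forward(x):
    """One level of the integer Haar (S) transform.

    Returns (s, d): integer approximations and integer details.
    x must have even length.
    """
    x = np.asarray(x, dtype=np.int64)
    even, odd = x[0::2], x[1::2]
    d = odd - even          # detail: difference of each sample pair
    s = even + d // 2       # approximation: floor of the pair average
    return s, d

def s_transform_inverse(s, d):
    """Exactly undo s_transform_forward."""
    s = np.asarray(s, dtype=np.int64)
    d = np.asarray(d, dtype=np.int64)
    even = s - d // 2
    odd = d + even
    out = np.empty(2 * len(s), dtype=np.int64)
    out[0::2], out[1::2] = even, odd
    return out
```

For smooth image rows the details `d` cluster around zero and are cheaper to entropy-code than the raw samples, while the rounding in the lifting step is undone exactly by the inverse, so no information is lost.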