The performance of image classification largely depends on both the discrimination power of the visual features
for image content representation and the effectiveness of the kernels for diverse image similarity characterization.
Different types of kernels have been developed for SVM image classifier training, and different research teams may
use different types of visual features in their experiments. Thus there is an urgent need to provide benchmark work
to assess the real performance of different types of visual features and kernels for various image classification
tasks. In this paper, we have benchmarked multiple approaches for feature extraction and image similarity
characterization, so that some useful guidelines can be provided for: (a) how to select a more effective approach
for feature extraction and enhance the discrimination power of various types of visual features; and (b) how to
combine multiple types of visual features and their kernels to enhance the discrimination power of SVM image
classifiers. Our experiments on large-scale image collections have also obtained very positive results.
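As a rough illustration of point (b), one common way to combine several feature types is a weighted sum of per-feature kernels, which stays a valid kernel as long as the weights are non-negative. The sketch below is not taken from the paper: the feature names (`color`, `texture`), the kernel choices, the `gamma` defaults, and the weights are all assumptions made for the example.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel on two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def chi2_kernel(x, y, gamma=1.0):
    """Exponential chi-square kernel for non-negative histogram features."""
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(x, y) if a + b > 0)
    return math.exp(-gamma * dist)

def combined_kernel(img_a, img_b, weights):
    """Weighted sum of one kernel per feature type (assumed combination rule)."""
    return (weights["color"] * chi2_kernel(img_a["color"], img_b["color"])
            + weights["texture"] * rbf_kernel(img_a["texture"], img_b["texture"]))

# Hypothetical toy images: a 3-bin colour histogram and a 2-D texture descriptor.
img1 = {"color": [0.5, 0.3, 0.2], "texture": [0.1, 0.9]}
img2 = {"color": [0.2, 0.3, 0.5], "texture": [0.8, 0.2]}
weights = {"color": 0.6, "texture": 0.4}  # assumed weights, non-negative

# Gram matrix that a kernel SVM could consume as a precomputed kernel.
gram = [[combined_kernel(a, b, weights) for b in (img1, img2)]
        for a in (img1, img2)]
```

In practice the weights would be tuned, for example by cross-validation; the fixed values here only show the mechanics.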
To crawl large amounts of weakly-tagged images for computer vision tasks such as object detection and scene
recognition, it is very important to develop new techniques for tag cleansing and word sense disambiguation
(i.e., removing irrelevant images from the crawled results). Based on this observation, a topic network is first
generated to characterize both the semantic similarity contexts and the visual similarity contexts between the
image topics more fully. The topic network is used to represent the classes of objects and scenes of interest.
Second, both the visual similarity contexts between the images and the semantic similarity contexts between
their tags are integrated for tag cleansing and word sense disambiguation. By addressing the issues of polysemes
and synonyms more effectively, our word sense disambiguation algorithm can determine the relevance between
the images and the associated tags more precisely, and thus it can allow us to crawl large-scale weakly-tagged
images for computer vision tasks.
Automatic image segmentation is a fundamental and challenging task in image analysis. We present a stochastic contour approach in which multiple agents, each driven by a simple policy, draw the contour stochastically. A contour confidence map is formed, and the image is partitioned hierarchically according to the probability of being surrounded by an average contour. The segmentation is formed by truncating the hierarchical tree based on the dissimilarity increment. The average contour formed in the stochastic contour approach no longer depends on the initial conditions and tolerates weaker convergence guarantees. The stochastic contour evolution provides perturbation to jump out of local minima, while the average contour handles model uncertainty naturally. No training process is involved in this approach. Experimental evaluation on a large number of images with diverse visual properties shows that our technique is robust and performs well.
Proc. SPIE. 6914, Medical Imaging 2008: Image Processing
KEYWORDS: Image processing algorithms and systems, Curium, Tissues, Image segmentation, Medical imaging, Neuroimaging, Expectation maximization algorithms, 3D image processing, 3D magnetic resonance imaging, Brain
This paper presents a novel algorithm for 3D human brain tissue segmentation and classification in magnetic resonance imaging (MRI), based on a region-restricted EM algorithm (RREM).
The RREM is a level set segmentation method in which the evolution of the contours is driven by the force field composed of the probability density functions of the Gaussian models.
Each tissue is modeled by one or more Gaussian models restricted by a free-shaped contour, so that the Gaussian models adapt to the local intensities.
The RREM is guaranteed to converge to a local minimum.
Split and merge operations keep the segmentation from being trapped in a local minimum.
A fuzzy rule-based classifier finally groups the regions belonging to the same tissue and forms the segmented 3D image of white matter (WM) and gray matter (GM), which are of major interest in numerous applications.
The presented method can be extended, by adjusting the classifier, to segment brain images with tumors or images in which part of the brain has been removed.
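To make the Gaussian modeling step concrete, the sketch below fits a two-component 1-D Gaussian mixture to intensity values with plain EM. It is only a simplified stand-in: it has no contour restriction, no level set evolution, and no split and merge step, so it is not the RREM itself. The function names, the initialisation, and the toy intensities are assumptions made for illustration.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_two_gaussians(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture to intensity values with EM."""
    means = [float(min(values)), float(max(values))]  # deterministic start
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior probability of each component for each value.
        resp = []
        for x in values:
            p = [weights[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pk / total for pk in p])
        # M-step: re-estimate each component from its weighted samples.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk == 0:
                continue  # component received no support; leave it unchanged
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, 1e-6)  # keep the variance strictly positive
            weights[k] = nk / len(values)
    return means, variances, weights

# Hypothetical intensities drawn from two well-separated tissue classes.
intensities = [30, 32, 31, 29, 33, 80, 82, 81, 79, 83]
means, variances, weights = em_two_gaussians(intensities)

# Assign each intensity to the component with the larger weighted density.
labels = [max(range(2), key=lambda k: weights[k] * gaussian_pdf(x, means[k], variances[k]))
          for x in intensities]
```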
To support more effective video retrieval at the semantic level, we introduce a novel framework to achieve semantic video classification. This novel framework includes: (a) A semantic-sensitive video content representation framework via principal video shots to enhance the quality of features (i.e., the ability of the selected low-level multimodal perceptual features to discriminate among various semantic video concepts); (b) A semantic video concept interpretation framework via a flexible mixture model to bridge the semantic gap between the semantic video concepts and the low-level multimodal perceptual features; (c) A novel concept learning technique to integrate unlabeled samples with labeled samples for more accurate classifier training. Experimental results on semantic medical video classification are also presented to evaluate the performance of the proposed framework.
To provide users with an overview of medical video content at various levels of abstraction which can be used for more efficient database browsing and access, a hierarchical video summarization strategy has been developed and is presented in this paper. To generate an overview, the key frames of a video are preprocessed to extract special frames (black frames, slides, clip art, sketch drawings) and special regions (faces, skin or blood-red areas). A shot grouping method is then applied to merge the spatially or temporally related shots into groups. The visual features and knowledge from the video shots are integrated to assign the groups into predefined semantic categories. Based on the video groups and their semantic categories, video summaries for different levels are constructed by group merging, hierarchical group clustering and semantic category selection. Based on this strategy, a user can select the layer of the summary to access. The higher the layer, the more concise the video summary; the lower the layer, the greater the detail contained in the summary.
The seeded region growing (SRG) algorithm is very attractive for semantic image segmentation, but it also suffers from the problems of pixel sorting orders for labeling and automatic seed selection. We design an automatic SRG algorithm, along with a boundary-oriented parallel pixel labeling technique and an automatic seed selection method. In order to support more efficient image access over large-scale databases, we suggest a multi-level image database management structure. This framework also supports concept-oriented image classification via a probabilistic approach. Hierarchical image indexing and summarization are also discussed.
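For readers unfamiliar with SRG, the following is a minimal sketch of the classical sequential form, in which unlabeled pixels are processed in order of their difference from the adjacent region mean. That ordering is the sorting step the abstract refers to. The sketch takes seeds as given and does not implement the automatic seed selection or the boundary-oriented parallel labeling proposed in the paper; the function name and the toy image are assumptions.

```python
import heapq

def seeded_region_growing(image, seeds):
    """Grow labelled regions from seeds over a 2-D grid of intensities.

    image: list of rows of numbers; seeds: dict mapping label -> (row, col).
    Pixels are labelled in order of similarity to the neighbouring region mean.
    """
    rows, cols = len(image), len(image[0])
    labels = [[None] * cols for _ in range(rows)]
    sums, counts, heap = {}, {}, []

    def push_neighbours(r, c, label):
        # Queue unlabelled 4-neighbours, keyed by distance to the region mean.
        mean = sums[label] / counts[label]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] is None:
                heapq.heappush(heap, (abs(image[nr][nc] - mean), nr, nc, label))

    for label, (r, c) in seeds.items():
        labels[r][c] = label
        sums[label], counts[label] = image[r][c], 1
    for label, (r, c) in seeds.items():
        push_neighbours(r, c, label)

    while heap:
        _, r, c, label = heapq.heappop(heap)
        if labels[r][c] is not None:
            continue  # already claimed by a better-matching candidate
        labels[r][c] = label
        sums[label] += image[r][c]
        counts[label] += 1
        push_neighbours(r, c, label)
    return labels

# Hypothetical 3 x 4 image with a dark left half and a bright right half.
image = [
    [10, 11, 90, 92],
    [12, 10, 91, 90],
    [11, 12, 89, 91],
]
segmentation = seeded_region_growing(image, {"dark": (0, 0), "bright": (0, 3)})
```

Because each pixel is taken from a single global priority queue, the result depends on the processing order, which is one motivation for the parallel labeling scheme described above.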
This paper proposes an integrated system for supporting content-based video retrieval and browsing over networks. An automatic semantic video object extraction technique for providing more compact video representation is developed. The video images are first partitioned into a set of homogeneous regions with accurate boundaries by integrating the results of color edge detection and region growing procedures. The object seeds, which are the intuitive and representative parts of the semantic objects, are detected from these homogeneous image regions. The semantic objects are then generated by a seeded region aggregation or a human interaction procedure. These semantic objects are tracked along the time axis to exploit their temporal correspondences among frames. Given the semantic video objects represented by a set of visual features, a seeded semantic video content clustering technique is developed for providing more effective video indexing, retrieval and browsing.
In this paper, a novel object-oriented hierarchical image and video segmentation algorithm is proposed based on 2D entropic thresholding, where the local variance contrast is selected for generating the 2D entropic surface because this parameter can indicate the strength of the edge accurately. The extracted object is first represented coarsely by a group of 4 × 4 blocks; the intra-block edge extraction procedure and the joint spatiotemporal similarity test among neighboring blocks are then performed to determine the meaningful real objects. Experimental results have confirmed that the proposed hierarchical algorithm may be very useful for MPEG-4 applications, such as determining the Video Object Plane Formation automatically and selecting the coding pattern adaptively. A novel fast algorithm is also introduced for reducing the search burden. Moreover, this unsupervised algorithm also makes automatic image and video segmentation possible.
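The idea behind entropic thresholding can be sketched in its simpler 1-D form: choose the grey level that maximises the sum of the entropies of the two classes it separates (a Kapur-style maximum-entropy criterion). The paper works on a 2D entropic surface built from local variance contrast, which this sketch does not reproduce; the function name and the toy histogram are assumptions.

```python
import math

def entropic_threshold(histogram):
    """Return the grey level t maximising the summed entropies of the two
    classes (levels <= t and levels > t) of a 1-D histogram."""
    total = float(sum(histogram))
    probs = [h / total for h in histogram]
    best_t, best_score = None, float("-inf")
    cumulative = 0.0
    for t in range(len(probs) - 1):
        cumulative += probs[t]
        if cumulative <= 0.0 or cumulative >= 1.0:
            continue  # one of the two classes would be empty
        # Entropy of each class after normalising by its total probability.
        low = -sum((p / cumulative) * math.log(p / cumulative)
                   for p in probs[:t + 1] if p > 0)
        high = -sum((p / (1.0 - cumulative)) * math.log(p / (1.0 - cumulative))
                    for p in probs[t + 1:] if p > 0)
        if low + high > best_score:
            best_t, best_score = t, low + high
    return best_t

# Hypothetical bimodal histogram: one cluster near level 11, one near level 201.
hist = [0] * 256
for level in (10, 11, 12, 200, 201, 202):
    hist[level] = 5

threshold = entropic_threshold(hist)  # falls between the two clusters
```

This direct version recomputes both entropies for every candidate level, so its cost grows quadratically with the number of grey levels. The search cost grows further in the 2-D case, which is the burden the fast algorithm in the abstract is meant to reduce.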
This paper reports our study of a novel image sequence coding scheme based on 3D image segmentation. The intervals between successive reference frames are adaptively adjusted to increase the compensability of image sequences. The bit rate allocation takes advantage of the temporal masking of the human visual system.