In this paper, we propose a contrast stretching method that reflects the visibility of the human visual system (HVS) as well as the
histogram of a target image. In the proposed method, a target image is contrast-stretched by the auto gain/offset with
visibility-based clipping, where the clipping thresholds are decided to maximize the averaged visibility of the contrast-stretched
image. The visibility function is defined by a spatial JND (just noticeable difference), which is the threshold
below which any change of a given pixel from its textured neighbors is not perceived by the HVS. Experimental results
show that the proposed method efficiently stretches the contrast of target images to be more pleasing to the HVS than some
We propose a content-based image retrieval (CBIR) method based on an efficient combination of a color feature and multiresolution texture features. As the color feature, an HSV autocorrelogram is chosen, which is known to measure the spatial correlation of colors well. As texture features, BDIP and BVLC moments are chosen, which are known to measure local intensity variations and local texture smoothness well, respectively. The texture features are obtained in a wavelet pyramid of the luminance component of a color image. The extracted features are combined for efficient similarity computation by normalization depending on their dimensions and standard deviation vectors. Experimental results show that the proposed method yields, on average, 10% better performance in precision vs. recall and an average improvement of 0.12 in average normalized modified retrieval rank (ANMRR) over the methods using color autocorrelogram, BDIP and BVLC moments, and wavelet moments, respectively.
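The normalization step described above can be illustrated with a minimal sketch. It assumes an L1 distance per feature type, scaled by a per-component standard deviation vector and by the feature dimension; the abstract does not give the exact formula, so the function name `combined_distance` and the choice of L1 distance are assumptions made for illustration only.

```python
import numpy as np

def combined_distance(query_feats, target_feats, stds):
    """Sum per-feature distances after normalizing by std vector and dimension.

    query_feats, target_feats, stds: lists of 1-D arrays, one entry per
    feature type (for example color and texture). Hypothetical interface.
    """
    total = 0.0
    for q, t, s in zip(query_feats, target_feats, stds):
        q, t, s = (np.asarray(a, dtype=float) for a in (q, t, s))
        # Dividing by the std keeps high-variance components from dominating;
        # dividing by the dimension keeps long feature vectors from dominating.
        total += np.sum(np.abs(q - t) / s) / q.size
    return float(total)
```

With this scaling, a two-component feature and a one-component feature contribute on a comparable scale to the combined distance, which is the purpose the abstract attributes to the normalization.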
Proc. SPIE. 5296, Document Recognition and Retrieval XI
KEYWORDS: Optical filters, Cameras, Image processing, Digital filtering, Personal digital assistants, Image filtering, Image classification, Optical character recognition, Yield improvement, Binary data
In this paper, we propose a block adaptive binarization (BAB) using a modified quadratic filter (MQF) to binarize business card images acquired under poor conditions by personal digital assistant (PDA) cameras. In the proposed method, a business card image is first partitioned into 8×8 blocks, and the blocks are then classified into character blocks (CBs) and background blocks (BBs) for locally adaptive processing. Each CB is windowed with a 24×24 rectangular window centered on the CB, and the windowed blocks are improved by the preprocessing filter MQF, in which the threshold selection scheme of the QF is modified. The 8×8 center block of the improved block is binarized with the threshold. A binary image is obtained by tiling each binarized block in its original position. Experimental results show that the quality of binary images obtained by the proposed method is much better than that obtained by conventional global binarization (GB) using the QF. In addition, the proposed method yields an improvement of about 43% in character recognition rate over the GB using the QF.
In this paper, we propose an effective boundary-matching-based error detection algorithm using causal neighbor blocks in H.263 coded video to improve video quality degraded by channel errors. The proposed algorithm first calculates the boundary mismatch powers between the current block and each of its causal neighbor blocks. It then decides that the current block is normal if all the mismatch powers are less than an adaptive threshold, which is determined from the statistics of the two adjacent blocks. In experiments with 16-bit burst errors at bit error rates (BERs) of 10^-4 to 10^-3, the proposed algorithm yields improvements of up to 20% in error detection rate and up to 3.5 dB in the PSNR of concealed frames, compared with Zeng's error detection algorithm.
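The boundary test described above can be sketched as follows. The sketch assumes that the causal neighbors are the blocks above and to the left of the current block, and that the mismatch power is the mean squared difference across the shared boundary. The adaptive threshold rule is not given in the abstract, so `threshold` is passed in as a plain parameter; all names here are hypothetical.

```python
import numpy as np

def boundary_mismatch_power(cur, nb, side):
    """Mean squared pixel difference across the boundary shared with a neighbor."""
    cur = np.asarray(cur, dtype=float)
    nb = np.asarray(nb, dtype=float)
    if side == "top":        # neighbor lies above the current block
        diff = cur[0, :] - nb[-1, :]
    elif side == "left":     # neighbor lies to the left of the current block
        diff = cur[:, 0] - nb[:, -1]
    else:
        raise ValueError("side must be 'top' or 'left'")
    return float(np.mean(diff ** 2))

def is_normal(cur, neighbors, threshold):
    """Declare the block normal only if every mismatch power is below the threshold.

    neighbors: dict mapping 'top' and/or 'left' to the neighbor block.
    threshold: stands in for the adaptive threshold, whose rule is not specified here.
    """
    return all(boundary_mismatch_power(cur, nb, side) < threshold
               for side, nb in neighbors.items())
```

A block decoded from corrupted bits tends to show a sharp discontinuity at its boundaries, so a large mismatch power against already-decoded neighbors flags it as erroneous.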
This paper proposes a three-dimensional (3D) region-based segmentation algorithm for extracting a diagnostic tumor from ultrasound images by using split-and-merge and seeded region growing with a distortion-based homogeneity cost. In the proposed algorithm, 2D cutting planes are first obtained by the equiangular revolution of a cross-sectional plane about a reference axis of the 3D volume data. In each cutting plane, an elliptic seed mask that fits tightly inside the tumor of interest is set. At the same time, each plane is finely segmented using split-and-merge with a distortion-based cost. In the finely segmented result, all of the regions that cross or are contained in the elliptic seed mask are then merged. The merged region is taken as the seed region for the seeded region growing, in which the seed region is recursively merged with adjacent regions until a predefined condition is reached. The contour of the final seed region is then extracted as the contour of the tumor. Finally, a 3D volume of the tumor is rendered from the set of tumor contours obtained for all the cutting planes. Experimental results for artificial 3D volume data show that the proposed method reduces the error rate by up to a factor of three compared with Krivanek’s method. For real 3D ultrasonic volume data, the error rates of the proposed method are lower than 17% when manually obtained results are used as reference data. It is also found that the tumor contours extracted by the proposed algorithm coincide closely with those estimated by human vision.
An efficient algorithm is proposed for interactive ultrasound image retrieval using the magnitude frequency spectrum (MFS). The interactive retrieval is especially intended for training interns to diagnose with ultrasound images. In the retrieval process, information on which of the object images retrieved in the previous iteration are relevant to the query image is fed back by user interaction. To improve discrimination between the query image and each object image in a database (DB) by using the MFS, which is powerful for ultrasound image retrieval, we incorporate feature vector normalization and root filtering in feature extraction. To effectively integrate the feedback information, we use a feedback scheme based on the Rocchio equation, where the feature of the query image is replaced with the weighted average of the feature of the query image and those of the object images. Experimental results for real ultrasound images show that, while the initial retrieval yields a precision of about 75% at a recall of about 8%, the interactive procedure yields a great performance improvement, reaching a precision of about 95% in the third iteration.
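The feedback step can be illustrated with a minimal sketch of a Rocchio-style update. It assumes that only the images marked relevant enter the average and that a single weight `alpha` balances the old query feature against their mean; the abstract does not state the weights, so `alpha` and the function name are assumptions.

```python
import numpy as np

def rocchio_update(query, relevant, alpha=0.5):
    """Replace the query feature with a weighted average of itself and relevant features.

    query: 1-D feature vector of the current query.
    relevant: 2-D array, one row per image the user marked as relevant.
    alpha: weight on the old query feature (assumed value, not from the source).
    """
    query = np.asarray(query, dtype=float)
    relevant = np.atleast_2d(np.asarray(relevant, dtype=float))
    if relevant.size == 0:
        # No feedback in this iteration: keep the query unchanged.
        return query
    return alpha * query + (1.0 - alpha) * relevant.mean(axis=0)
```

Each iteration moves the query feature toward the images the user accepted, so the next retrieval ranks images similar to them higher.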
In this paper, we propose an efficient algorithm for organ recognition in ultrasound images using the log power spectrum. The main procedure of the algorithm consists of feature extraction and feature classification. In the feature extraction, the log power spectrum is used as a translation-invariant feature for extracting information on the echo of organ tissues from a preprocessed input image. In the feature classification, the Mahalanobis distance is used as a measure of the similarity between the feature of an input image and the representative feature of each class. Experimental results for real ultrasound images show that the proposed algorithm improves the recognition rate by up to 30% over the recognition algorithm using the power spectrum and Euclidean distance, and by 10-40% over the recognition algorithm using the weighted quefrency complex cepstrum.
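The two stages described above can be sketched as follows, under stated assumptions: the log power spectrum is computed from a 2D FFT with a small `eps` added to avoid taking the logarithm of zero, and each class is represented by a mean feature and an inverse covariance matrix. The function names, the `eps` value, and the class labels in the usage are hypothetical.

```python
import numpy as np

def log_power_spectrum(image, eps=1e-12):
    """Log of the squared FFT magnitude; eps guards against log(0)."""
    spectrum = np.fft.fft2(np.asarray(image, dtype=float))
    return np.log(np.abs(spectrum) ** 2 + eps)

def mahalanobis_distance(x, mean, cov_inv):
    """Distance of x from a class mean, scaled by the class inverse covariance."""
    d = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    return float(np.sqrt(d @ np.asarray(cov_inv, dtype=float) @ d))

def classify(feature, class_means, class_cov_invs):
    """Return the label of the class with the smallest Mahalanobis distance."""
    return min(class_means,
               key=lambda c: mahalanobis_distance(feature, class_means[c],
                                                  class_cov_invs[c]))
```

In practice the spectrum would be flattened, and probably reduced in dimension, before the covariance is estimated; that step is omitted here because the abstract does not describe it.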
In this paper, we first propose new texture features, BDIP (block difference of inverse probabilities) and BVLC (block variation of local correlation coefficients), for content-based image retrieval (CBIR) and then present an image retrieval method based on the combination of BDIP and BVLC moments. BDIP uses the local probabilities in image blocks to measure the variation of brightness well. BVLC uses the variations of local correlation coefficients in image blocks to measure texture smoothness well. Experimental results show that the presented retrieval method yields about 12% better performance than the method using only BDIP or BVLC moments and about 10% better performance than the method using wavelet moments.
We propose an efficient method for content-based ultrasound image retrieval using the magnitude frequency spectrum and implement an ultrasound image retrieval system based on the proposed method. The target images are ultrasound images of adult organs. Such images are usually acquired by trained staff, so images of the same kind of organ are very similar, although their locations may not exactly coincide. Therefore, the magnitude frequency spectrum, which has a translation-invariant property, is used as a feature for content-based retrieval. A test image database is composed of real ultrasound images. As a retrieval result, a specified number of highly similar target images are retrieved from all the target images. If all the target images in the database are pre-classified by organ type, the retrieved images are selected from among the images whose class is the same as that of the image with the highest similarity. Experimental results show that the proposed method is superior to other methods. In particular, the proposed method yields a further performance improvement when the pre-classification is used. Moreover, it is found from the experimental results that the magnitude frequency spectrum method is robust to the speckle noise that usually exists in ultrasound images.
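The translation-invariant property mentioned above can be checked numerically. The sketch below uses a random array as a stand-in for an image and a circular shift as the translation; the invariance of the DFT magnitude is exact only for circular shifts, so for real probe displacements it should be read as an approximation.

```python
import numpy as np

def magnitude_spectrum(image):
    """Magnitude of the 2D discrete Fourier transform of an image."""
    return np.abs(np.fft.fft2(np.asarray(image, dtype=float)))

# A random array stands in for an ultrasound image (illustrative data only).
rng = np.random.default_rng(0)
image = rng.random((16, 16))

# A circular shift changes only the phase of the DFT, not its magnitude.
shifted = np.roll(image, shift=(3, 5), axis=(0, 1))
same = np.allclose(magnitude_spectrum(image), magnitude_spectrum(shifted))
```

Here `same` evaluates to `True`, which is why two images of the same organ at slightly different positions produce nearly identical features.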
An efficient algorithm is presented for estimating interframe distances in a 2D frame sequence acquired by freehand scanning for the reconstruction of a 3D ultrasound image. Since interframe distances in a 2D frame sequence obtained with a hand-held ultrasound probe are not uniform, a 3D image directly reconstructed from such 2D data can deviate substantially from the real form of the human organs. Accordingly, to estimate the interframe distances in a 2D frame sequence, block-based lateral correlation functions are determined in each frame, and it is assumed that each block-based lateral correlation function is identical to the interframe correlation function of the block. Based on this assumption, the interframe distance between each image block and the corresponding block of the adjacent frame is then estimated. Finally, the interframe distance between adjacent frames is estimated by averaging the estimated block-wise interframe distances. Experimental results showed that the proposed algorithm was effective in estimating the interframe distances of the test sequences and that the 3D images reconstructed using the proposed method were nearly identical to the original ones.
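The block-wise estimation described above amounts to inverting a correlation curve. The sketch below assumes that the lateral correlation function is sampled at a known pixel spacing, decreases monotonically with distance, and is inverted by linear interpolation; none of these details are given in the abstract, and the function names are hypothetical.

```python
import numpy as np

def estimate_block_distance(lateral_corr, spacing, interframe_corr):
    """Find the distance at which the lateral correlation equals the interframe one.

    lateral_corr: correlation values at lags 0, 1, 2, ... pixels (assumed decreasing).
    spacing: physical distance per pixel lag.
    interframe_corr: measured correlation between the block and its counterpart
                     in the adjacent frame.
    """
    rho = np.asarray(lateral_corr, dtype=float)
    dist = np.arange(rho.size) * spacing
    # np.interp needs increasing x values, so reverse the decreasing curve.
    return float(np.interp(interframe_corr, rho[::-1], dist[::-1]))

def estimate_frame_distance(block_distances):
    """Average the block-wise estimates to obtain one distance per frame pair."""
    return float(np.mean(block_distances))
```

For example, with correlations `[1.0, 0.8, 0.6, 0.4]` at a spacing of 0.5 and a measured interframe correlation of 0.7, the estimate falls midway between the lags at 0.5 and 1.0.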
We present an efficient algorithm using a region-based texture feature for the extraction of texture regions. The key idea of this algorithm is based on the fact that most of the variations of local correlation coefficients (LCCs) according to different orientations are clearly larger in texture regions than in shade regions. An object image is first segmented into homogeneous regions. The variations of LCCs are next averaged in each segmented region. Based on the averaged variations of LCCs, each region is then classified as a texture or shade region. The threshold for classification is found automatically by an iterative threshold selection technique. To evaluate the performance of the proposed algorithm, we use six test images (Lena, Woman, Tank, Jet, Face and Tree) of 256 × 256 8-bit pixels. Experimental results show that the proposed feature suitably extracts the regions that appear visually as texture regions.
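The abstract does not say which iterative threshold selection technique is used. The sketch below assumes the common two-class scheme in which the threshold is repeatedly set to the midpoint of the two class means until it stops changing; it is an illustration of that scheme, not necessarily the authors' choice.

```python
import numpy as np

def iterative_threshold(values, tol=1e-6, max_iter=100):
    """Two-class iterative threshold: midpoint of the class means, repeated to convergence."""
    v = np.asarray(values, dtype=float).ravel()
    t = v.mean()  # start from the global mean
    for _ in range(max_iter):
        low, high = v[v <= t], v[v > t]
        if low.size == 0 or high.size == 0:
            break  # all values fall in one class; keep the current threshold
        new_t = 0.5 * (low.mean() + high.mean())
        converged = abs(new_t - t) < tol
        t = new_t
        if converged:
            break
    return float(t)
```

Applied to the region-averaged LCC variations, regions above the returned threshold would be labeled texture and the rest shade.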
We propose an efficient vehicle detection and classification algorithm using a shadow-robust feature for electronic toll collection. The local correlation coefficient between wavelet-transformed input and reference images is used as this feature, which takes advantage of textural similarity. The usefulness of the proposed feature is analyzed qualitatively by comparing it with the local variance of a difference image, and is verified by measuring the improvement in the separability of vehicles from shadowed or shadowless road for a real test image. Experimental results from field tests show that the proposed vehicle detection and classification algorithm performs well even under abrupt intensity changes caused by sensor characteristics and the occurrence of shadows.
We propose an iterative algorithm for reducing the blocking artifact in block transform-coded images by using a wavelet transform (WT). An image is considered as a set of one-dimensional (1-D) horizontal and vertical signals, and a 1-D WT is utilized in which the mother wavelet is the first-order derivative of a Gaussian-like function. The blocking artifact is reduced by using a minimum mean square error (MMSE) filter in the wavelet domain to remove the blocking component, which causes the variance at the block boundary positions in the first-scale wavelet domain to be abnormally higher than at the other positions. This filter minimizes the MSE between the ideal blocking-component-free signal and the restored signal in the neighborhood of block boundaries in the wavelet domain. It also uses the local variance in the wavelet domain for pixel-adaptive processing. The filtering and the projection onto the convex set defined by the quantization constraint are performed iteratively in alternating fashion. Experimental results show that the proposed method yields not only a PSNR improvement of about 0.46 to 1.26 dB, but also subjective quality nearly free of the blocking artifact and edge blur.
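The projection onto the quantization constraint set can be sketched as a clipping operation. The sketch assumes a uniform quantizer with rounding, so that each transform coefficient must lie within half a step of its decoded reconstruction level; the abstract does not specify the quantizer, and the function name and argument layout are hypothetical.

```python
import numpy as np

def project_quantization_constraint(coeffs, q_indices, step):
    """Clip transform coefficients to the intervals implied by their quantization indices.

    coeffs: transform coefficients of the filtered image.
    q_indices: quantization indices received in the bitstream.
    step: quantizer step size (scalar or array), assumed uniform with rounding.
    """
    coeffs = np.asarray(coeffs, dtype=float)
    q = np.asarray(q_indices, dtype=float)
    lower = (q - 0.5) * step
    upper = (q + 0.5) * step
    # Values already inside their interval are left untouched; others move to the nearest bound.
    return np.clip(coeffs, lower, upper)
```

Alternating this projection with the filtering step keeps the restored image consistent with the coded data, so the filter cannot smooth away detail that the bitstream actually encodes.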
An efficient video segmentation algorithm with a homogeneity measure that incorporates spatial proximity, color, and motion information simultaneously is presented for region-based coding. The procedure toward complete segmentation consists of two steps: primary segmentation and secondary segmentation. In the primary segmentation, an input image is finely segmented by FSCL. In the secondary segmentation, the many small and similar regions generated in the preceding step are eliminated or merged by a fast RSST. Experiments show that the proposed algorithm produces efficient segmentation results and that a video coding system using this algorithm yields visually acceptable quality and a PSNR of 36 to 37 dB at a very low bit rate of about 13.2 kbits/s.
A new 3-D segmentation-based coding technique is proposed for transmitting motion video with reasonably acceptable quality even at a very low bit rate. Only meaningful motion areas are extracted by using two change detection masks, and the current frame is segmented directly rather than the difference frame, so that good image quality can be obtained at high compression ratios. In the experiments, the Miss America sequence is reconstructed with visually acceptable quality at a very high compression ratio of 360:1.