This paper proposes a new technique for rectangle detection in images based on hierarchical feature complexity. The algorithm follows a bottom-up/top-down approach: in the bottom-up phase, contour curves are extracted and their edges are fitted to straight lines. Long contours may grow away from the object boundary, and they may fail to complete a loop due to missing edges. The proposed algorithm addresses these problems in the top-down phase through two simple rules. First, contours are split into segments at the points where non-convexity occurs, since these are the points where long contours depart from the object boundary. Second, the split segments are classified into six classes according to their probability of forming a rectangle, depending on the number of segment sides and right angles they enclose. These classes are then completed into rectangles by searching for suitable lines that may have been missed during the bottom-up phase. The method is verified through experiments on a set of images covering several applications. The results are compared to state-of-the-art methods and benchmarked against ground truth.
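The first top-down rule above can be sketched as follows. This is a minimal, hypothetical illustration (not the paper's implementation): a contour is given as a list of fitted-line vertices, and it is split wherever the turn direction flips sign, i.e. where convexity breaks.

```python
# Hypothetical sketch of the splitting rule: a contour, given as a list of
# vertices of fitted line segments, is split at vertices where convexity
# breaks (the turn direction changes sign). Names are illustrative only.

def cross(o, a, b):
    """z-component of (a-o) x (b-o); its sign gives the turn direction at o->a->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def split_at_nonconvexity(vertices):
    """Split a polyline into maximal segments with a consistent turn sign."""
    segments, start, prev_sign = [], 0, 0
    for i in range(1, len(vertices) - 1):
        s = cross(vertices[i - 1], vertices[i], vertices[i + 1])
        sign = (s > 0) - (s < 0)
        if prev_sign and sign and sign != prev_sign:
            segments.append(vertices[start:i + 1])  # cut at the non-convex vertex
            start = i
        if sign:
            prev_sign = sign
    segments.append(vertices[start:])
    return segments
```

Each resulting segment is convex on its own, which is the precondition for the second rule's classification by side and right-angle counts.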
The recent popularity of digital cameras allows us to take a large number of images. There is an increasing need
for efficiently and accurately retrieving images containing a specific person from such image collections. While
only the visual features of the specific person are used in many query-by-example retrieval methods, we focus on
the fact that some people such as family or friends are more likely to appear in the same images than others and
use visual features of not only the queried person but also people who have strong co-occurrence relations with
the queried person to improve retrieval performance. Relevance feedback is used to learn who co-occurs
with the queried person in the same images, their faces, and the strength of their co-occurrence relations. For 116
images collected from 6 persons, after five feedback iterations, a recall rate of 53% was obtained by considering
the co-occurrence relations among people, compared with 34% when using only features of the queried person.
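The idea of combining the queried person's features with learned co-occurrence relations can be sketched roughly as below. All names, the scoring form, and the update rule are assumptions for illustration, not the paper's formulation.

```python
# Illustrative sketch (not the paper's exact model): an image's score combines
# face similarity to the queried person with a bonus for other faces whose
# co-occurrence weight with the query, learned via relevance feedback, is high.

def score_image(faces, query_sim, cooc_weight):
    """faces: face IDs detected in the image.
    query_sim: face ID -> similarity to the queried person (0..1).
    cooc_weight: face ID -> learned co-occurrence strength (0..1)."""
    direct = max((query_sim.get(f, 0.0) for f in faces), default=0.0)
    context = max((cooc_weight.get(f, 0.0) for f in faces), default=0.0)
    return direct + 0.5 * context  # the mixing weight 0.5 is an assumption

def update_cooc(cooc_weight, faces, relevant, lr=0.2):
    """One relevance-feedback step: strengthen co-occurrence weights for faces
    in images marked relevant, decay them for images marked irrelevant."""
    for f in faces:
        w = cooc_weight.get(f, 0.0)
        cooc_weight[f] = w + lr * (1.0 - w) if relevant else w * (1.0 - lr)
    return cooc_weight
```

Under this sketch, an image containing only a frequent companion of the queried person still receives a nonzero score, which is what lifts recall beyond query-only matching.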
There are several zoom-in video display methods, including full zoom and fisheye view, that magnify regions of interest (ROIs). However, those methods usually discard or deform the remaining regions without considering their content. In this paper, we propose a method for generating a content-preserving zoom-in view that magnifies ROIs while preserving the content of the remaining regions. Targeting surveillance videos, our method first extracts moving objects from every input frame as ROIs. Then, an importance score is calculated for each pixel in the input frame based on its content, to determine where deformation, which may destroy the content, should be avoided. Finally, a mapping problem from the input frame to the zoom-in view with respect to the importance score is formulated so that less important regions are deformed more than important ones. Experiments are conducted to study the effectiveness of considering content importance. We also compare the results of our method with those of other methods: fisheye view and a method using uniform scaling and seam carving.
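The core idea that less important regions should absorb more deformation can be illustrated with a much-simplified 1-D version of the mapping. The paper formulates a 2-D mapping problem; the sketch below only allocates output width to input columns in proportion to their importance, which magnifies important columns and compresses unimportant ones while keeping the total width fixed. The model and names are assumptions.

```python
# Minimal 1-D sketch of importance-weighted non-uniform scaling (a stand-in
# for the paper's 2-D mapping problem). High-importance columns get more
# output width; low-importance columns absorb the compression.

def column_widths(importance, total_width, eps=0.1):
    """importance: per-column scores in [0, 1].
    Returns output widths proportional to (eps + importance), normalized so
    they sum to total_width; eps keeps every column minimally visible."""
    raw = [eps + s for s in importance]
    scale = total_width / sum(raw)
    return [w * scale for w in raw]
```

Uniform scaling corresponds to constant importance; seam carving corresponds to driving some widths toward zero. This sketch sits between the two, which is the trade-off the importance score controls.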
In this paper, we extend Isotropy-based LSB steganalysis to detect the existence of a hidden message and estimate
the hidden message length when embedding is performed using both of two distinct embedding paradigms in one or
more LSBs. The extended method is based on analysis of image isotropy, which is usually affected by secret
message embedding. The proposed method is a general framework because it encompasses a more general case of LSB
steganalysis, namely when embedding is performed in more than one LSB and using both embedding paradigms. Compared
with our previously proposed weighted-stego-image-based method, the detection accuracy is higher. Experimental results
and theoretical verification show that this framework is effective for LSB steganalysis.
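As background on the class of statistics such steganalysis builds on, the classic chi-square attack on LSB replacement is sketched below. This is a generic textbook detector, not the paper's isotropy-based measure: LSB replacement equalizes the histogram counts within each pair of values (2k, 2k+1), so the statistic shrinks as the embedding rate grows.

```python
# Generic chi-square LSB-replacement detector sketch (Westfeld-Pfitzmann
# style), shown only as background; the paper's isotropy-based statistic
# is a different, more general measure.

def chi_square_stat(pixels):
    """Chi-square statistic over 8-bit value pairs (2k, 2k+1). Full-rate LSB
    replacement drives the two counts in each pair toward equality, so a
    small statistic suggests embedding."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    stat = 0.0
    for k in range(128):
        expected = (hist[2 * k] + hist[2 * k + 1]) / 2.0
        if expected > 0:
            stat += (hist[2 * k] - expected) ** 2 / expected
    return stat
```

A cover image with naturally skewed pair counts yields a large statistic, while a fully embedded stego image yields a statistic near zero.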
In this paper, we propose a method to retrieve scenes semantically similar to a query video from large-scale video databases at high speed. Our method uses audio features and the color histogram as the visual feature, because the audio signal is closely related to the semantic content of videos and color is an extensively used feature in content-based image retrieval systems. Feature vectors are extracted from video segments called packets, clustered in the feature vector space, and transformed into <i>symbols</i> that represent the cluster IDs. Consequently, a video is expressed as a symbol sequence based on audio and visual features. Quick retrieval of similar scenes can then be realized by symbol sequence matching. We conduct experiments using audio features, visual features, and both, and examine the effect of each feature on videos of various genres.
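The packet-to-symbol pipeline can be sketched as below. Cluster centroids, symbol alphabet, and the naive matcher are illustrative assumptions; the point is only that once each packet's feature vector is quantized to a cluster ID, a video becomes a string and similar-scene retrieval reduces to fast sequence matching.

```python
# Illustrative sketch of symbol-sequence retrieval: per-packet feature
# vectors are quantized to the nearest cluster centroid ("symbol"), then
# scenes are found by matching the query's symbol string. All names and
# the brute-force matcher are assumptions.

def to_symbols(features, centroids):
    """Map each packet's feature vector to the symbol of its nearest centroid."""
    def nearest(v):
        return min(range(len(centroids)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(v, centroids[i])))
    return "".join(chr(ord("A") + nearest(v)) for v in features)

def find_scene(database_symbols, query_symbols):
    """Return start indices where the query symbol sequence occurs."""
    n, m = len(database_symbols), len(query_symbols)
    return [i for i in range(n - m + 1) if database_symbols[i:i + m] == query_symbols]
```

In practice the brute-force scan would be replaced by an indexed string-matching structure (e.g. a suffix array), which is what makes retrieval from large databases fast.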