In recent years, databases have evolved from storing pure textual information to storing multimedia information -- text, audio, video, and images. With such databases comes the need for a richer set of search keys that include keywords, shapes, sounds, examples, sketches, color, texture, and motion. In this paper we address the problem of image retrieval where keys are object shapes or user sketches. In our scheme, shape features are extracted from each image as it is stored. The image is first segmented and points of high curvature are extracted. Regions surrounding the points of high curvature are used to compute feature values by comparing the regions with a number of references. The references themselves are picked out from the set of orthonormal wavelet basis vectors. An ordered set of distance measures between each local region and the wavelet references forms a feature vector. When a user queries the database through a sketch, the feature vectors for high curvature points on the sketch are determined. An efficient nearest neighbor search then yields a set of images which contain objects that match the user's sketch closely. The process is completely automated. Initial experimental results are presented.
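The final matching step, a nearest-neighbor search over stored feature vectors, can be sketched as follows. This is a minimal illustration with toy three-component vectors and invented image names; the paper's actual wavelet-derived features and index structure are not reproduced here, and the linear scan merely stands in for an efficient search structure.

```python
import math

def euclidean(a, b):
    # Distance between two feature vectors of equal length.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_images(query_vec, database, k=2):
    # database: list of (image_id, feature_vector) pairs.
    # Returns the k image ids whose feature vectors lie closest to
    # the query vector; a linear scan stands in for an efficient
    # index such as a k-d tree.
    ranked = sorted(database, key=lambda item: euclidean(query_vec, item[1]))
    return [image_id for image_id, _ in ranked[:k]]

# Toy database of per-image feature vectors (names are invented).
db = [("img_a", [0.1, 0.9, 0.2]),
      ("img_b", [0.8, 0.1, 0.7]),
      ("img_c", [0.2, 0.8, 0.3])]
print(nearest_images([0.12, 0.88, 0.22], db, k=2))
```

In the full scheme, one such search would be run per high-curvature point on the sketch, and the per-point results merged into a ranked list of candidate images.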
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks. You are receiving this notice because your organization may not have SPIE eBooks access. Shibboleth/OpenAthens users: please sign in to access your institution's subscriptions. To obtain this item, you may purchase the complete book in print or electronic format on SPIE.org.
In this paper we combine image feature extraction with indexing techniques for efficient retrieval in large texture image databases. A 2D image signal is processed using a set of Gabor filters to derive a 120-component feature vector representing the image. The feature components are ordered based on their relative importance in characterizing a given texture pattern, and this facilitates the development of efficient indexing mechanisms. We propose three different sets of indexing features based on the best feature, the average feature, and a combination of both. We investigate the tradeoff between accuracy and discriminating power using these different indexing approaches, and conclude that the combination of the best feature and the average feature gives the best results.
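The third indexing variant, combining the best (strongest) feature component with the average over all components, can be illustrated with a toy sketch. Four-component vectors stand in for the 120-component Gabor features, and the texture names and tolerance are invented; the point is only that a cheap two-number key prunes candidates before any full-vector comparison.

```python
def index_key(features):
    # Combined key: the single strongest (best) component and the
    # average over all components.
    return (max(features), sum(features) / len(features))

def candidates(query, database, tol=0.2):
    # Prune with the cheap 2-component key; only survivors would
    # undergo a full feature-vector comparison.
    qb, qa = index_key(query)
    return [name for name, feats in database
            if abs(index_key(feats)[0] - qb) <= tol
            and abs(index_key(feats)[1] - qa) <= tol]

# Toy texture database (names and values are invented).
db = [("brick", [0.9, 0.1, 0.2, 0.1]),
      ("sand",  [0.3, 0.2, 0.25, 0.2]),
      ("bark",  [0.85, 0.15, 0.25, 0.1])]
print(candidates([0.88, 0.12, 0.2, 0.12], db))
```

Using only the best component would make the key cheaper but less discriminating; only the average would blur strong individual responses, which is the tradeoff the abstract refers to.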
Advances in technologies for scanning, networking, and CD-ROM, lower prices for large disk storage, and acceptance of common image compression and file formats have contributed to an increase in the number, size, and uses of on-line image collections. New tools are needed to help users create, manage, and retrieve images from these collections. We are developing QBIC (query by image content), a prototype system that allows a user to create and query image databases in which the image content -- the colors, textures, shapes, and layout of images and the objects they contain -- is used as the basis of queries. This paper describes two sets of algorithms in QBIC. The first are methods that allow `query by color drawing,' a form of query in which a user draws an approximate color version of an image, and similar images are retrieved. These are automatic algorithms in the sense that no user action is necessary during database population. The second are algorithms for semi-automatic identification of image objects during database population, improving the speed and usability of this manually intensive step. Once objects are outlined, detailed queries on their content properties can be made at query time.
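The idea behind query by color drawing can be sketched by comparing grids of average cell colors. The 2x2 grids, RGB values, and squared-difference distance below are illustrative assumptions, not QBIC's actual algorithm; they only show why a rough drawing with approximately right colors in approximately right places is enough to rank images.

```python
def grid_distance(query_grid, image_grid):
    # Sum of squared RGB differences over corresponding grid cells;
    # a rough drawing only needs approximate per-cell colors to match.
    total = 0
    for qrow, irow in zip(query_grid, image_grid):
        for (qr, qg, qb), (ir, ig, ib) in zip(qrow, irow):
            total += (qr - ir) ** 2 + (qg - ig) ** 2 + (qb - ib) ** 2
    return total

# 2x2 grids of average cell colors: a seascape versus a forest
# (all values invented for illustration).
sea    = [[(100, 150, 255), (110, 150, 250)],
          [(20, 80, 200), (25, 85, 205)]]
forest = [[(40, 120, 40), (45, 130, 50)],
          [(30, 90, 35), (35, 95, 40)]]
# A user's rough drawing: blue sky over darker blue water.
sketch = [[(90, 140, 240), (100, 145, 240)],
          [(30, 90, 190), (30, 90, 190)]]
best = min([("sea", sea), ("forest", forest)],
           key=lambda item: grid_distance(sketch, item[1]))[0]
print(best)
```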
As digital images are progressing into the mainstream of information systems, managing and manipulating them as images becomes an important issue to be resolved before we can take full advantage of their information content. To achieve content-based image indexing and retrieval, there are active research efforts in developing techniques to utilize visual features. On the other hand, without an effective indexing scheme, any visual content based image retrieval approach will lose its effectiveness as the number of features increases. This paper presents our initial work in developing an efficient indexing scheme using an artificial neural network, which focuses on eliminating unlikely candidates rather than pin-pointing the targets directly. Experimental results in retrieving images using this scheme from a prototype visual database system are given.
In this paper, multidimensional feature measures of object shapes and feature blobs for retrieval of ceramic artifacts (e.g., plates, vases, and bowls) are proposed. These measures capture the various granularity of image features necessary for representation of complex image objects and their painted designs. Object shape is characterized by region compactness, boundary eccentricity, region moment, and region convexity. Highly detailed regions are characterized by blob properties such as total blob size, number of blobs, dispersion of blobs, and central moment of blobs. Each set of multiple feature measures jointly forms a 4-dimensional feature vector in a multidimensional feature space. Feature abstraction of complex image details is further improved by computing feature measurements on sub-resolution images. This allows features of different perceptual scales to be isolated and efficiently abstracted. We have applied our method of image content analysis for retrieval of ceramic artifacts and have shown that multiresolution multidimensional feature measures can adequately retrieve images with high perceptual similarity.
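Region compactness, the first of the shape measures listed, is conventionally defined as 4*pi*A/P**2, which equals 1 for a circle and shrinks for elongated or irregular regions. A minimal sketch with toy shapes (not real segmented regions from artifact images):

```python
import math

def compactness(area, perimeter):
    # 4*pi*A / P**2: 1.0 for a circle, smaller for elongated
    # or irregular regions.
    return 4 * math.pi * area / perimeter ** 2

# A circular region of radius 5 versus a 10x1 elongated region
# (toy numbers; real values would come from segmentation).
circle = compactness(math.pi * 5 ** 2, 2 * math.pi * 5)
bar = compactness(10 * 1, 2 * (10 + 1))
print(round(circle, 3), round(bar, 3))
```

The other listed measures (eccentricity, moments, convexity, blob statistics) would each contribute one component of the 4-dimensional vectors in the same spirit.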
Similarity between images is used for storage and retrieval in image databases. In the literature, several similarity measures have been proposed that may be broadly categorized as: (1) metric based, (2) set-theoretic based, and (3) decision-theoretic based measures. In each category, measures based on crisp logic as well as fuzzy logic are available. In some applications such as image databases, measures based on fuzzy logic would appear to be naturally better suited, although so far no comprehensive experimental study has been undertaken. In this paper, we report results of some of the experiments designed to compare various similarity measures for application to image databases. We are currently working with texture images and intend to work with face images in the near future. As a first step for comparison, the similarity matrices for each of the similarity measures are computed over a set of selected textures and are presented as visual images. Comparative analysis of these images reveals the relative characteristics of each of these measures. Further experiments are needed to study their sensitivity to small changes in images such as illumination, magnification, orientation, etc. We describe these experiments (sensitivity analysis, transition analysis, etc.) that are currently in progress. The results from these experiments offer assistance in choosing the appropriate measure for applications to image databases.
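As one concrete instance of a set-theoretic fuzzy measure, the ratio of the fuzzy intersection to the fuzzy union of two membership vectors can be computed and arranged into a similarity matrix like those the abstract describes. The three-component texture vectors below are invented for illustration; the paper's actual feature representation is not reproduced.

```python
def fuzzy_similarity(a, b):
    # Set-theoretic fuzzy measure: ratio of the fuzzy intersection
    # (element-wise min) to the fuzzy union (element-wise max).
    inter = sum(min(x, y) for x, y in zip(a, b))
    union = sum(max(x, y) for x, y in zip(a, b))
    return inter / union

def similarity_matrix(vectors):
    # Pairwise similarities; rendering this matrix as a grayscale
    # image gives the kind of visual comparison described above.
    return [[fuzzy_similarity(u, v) for v in vectors] for u in vectors]

# Toy membership vectors for three textures (values invented).
textures = [[0.9, 0.1, 0.4], [0.8, 0.2, 0.5], [0.1, 0.9, 0.2]]
m = similarity_matrix(textures)
for row in m:
    print([round(x, 3) for x in row])
```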
The key question for any image retrieval approach is how to represent the images. We are exploring a new image context vector representation that avoids the need for full image understanding. This representation: (1) is invariant with respect to translation and scaling of (whole) images, (2) is robust with respect to translation, scaling, small rotations, and partial occlusions of objects within images, (3) avoids explicit segmentation into objects, and (4) allows computation of image-query similarity using only about 300 multiplications and additions. A context vector is a high (approximately 300) dimensional vector that can represent images, subimages, or image queries. Image context vectors are an extension of previous work in document retrieval where context vectors were used to represent documents, terms, and queries. The image is first represented as a collection of pairs of features. Each feature pair is then transformed into a 300-dimensional context vector that encodes the feature pair and its orientation. All the vectors for pairs are added together to form the context vector for the entire image. Retrieval order is determined by taking dot products of image context vectors with a query context vector, a fast operation. Results from a first prototype look promising.
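The construction (encode each feature pair as a high-dimensional vector, sum the vectors, rank by dot product) can be sketched as follows. Seeding a pseudo-random generator with the feature pair merely stands in for the paper's actual encodings, and the feature-pair names are invented.

```python
import random

DIM = 300

def feature_vector(feature_pair, dim=DIM):
    # Deterministic pseudo-random encoding of one feature pair;
    # a stand-in for the paper's learned/designed encodings.
    rng = random.Random(feature_pair)
    return [rng.uniform(-1, 1) for _ in range(dim)]

def image_context_vector(feature_pairs):
    # The image's context vector is the sum of the vectors of
    # all its feature pairs.
    total = [0.0] * DIM
    for pair in feature_pairs:
        for i, x in enumerate(feature_vector(pair)):
            total[i] += x
    return total

def dot(a, b):
    # Image-query similarity: about DIM multiply-adds, as claimed.
    return sum(x * y for x, y in zip(a, b))

img = image_context_vector(["edge/corner", "corner/blob"])
other = image_context_vector(["stripe/blob"])
query = image_context_vector(["edge/corner"])
print(dot(img, query) > dot(other, query))
```

Because high-dimensional random vectors are nearly orthogonal, an image sharing a feature pair with the query scores far higher than an unrelated one, without any explicit segmentation.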
This paper describes the extended model for information retrieval (EMIR) designed for complex information description and retrieval and particularly well suited for image modeling. A main object in the proposed model has a three-part specification: a description that is a list of attributes; a composition that is a list of component objects; and a topology that is a list of semantic relationships between component objects, expressing more semantic aspects of the main object structure. The model is well suited for image modeling for two complementary reasons. On one hand, it can distinguish between an object structure and its contents. This is achieved by relaxing the classical class-object instantiation link, thus allowing objects to have individual non-categorized contents rather than those predicted in their classes. On the other hand, images typically have very different individual contents, and therefore cannot be easily modeled within a structured database model such as the relational model. The query language is organized according to the three-part organization of the model. A simple query has three parts: description, being constraints on attribute values; composition, being a set of sub-queries on the composition part of objects; and topology, being the specification of special required links on the results of composition sub-queries.
The database environment for vision research (DEVR) is an entity-oriented scientific database system based on a hierarchical relational data model (HRS). This paper describes the design and implementation of the data definition language, the application programmer's interface, and the query mechanism of the DEVR system. DEVR provides a dynamic data definition language for modeling image and vision data, which can be integrated with existing image processing and vision applications. Schema definitions can be fully interleaved with data manipulation, without requiring recompilation. In addition, DEVR provides a powerful application programmer's interface that regulates data access and schema definition, maintains indexes, and enforces type safety and data integrity. The system supports multi-level queries based on recursive constraint trees. A set of HRS entities of a given type is filtered through a network of constraints corresponding to the parts, properties, and relations of that type. Queries can be constructed interactively with a menu-driven interface, or they can be dynamically generated within a vision application using the programmer's interface. Query objects are persistent and reusable. Users may keep libraries of query templates, which can be built incrementally, tested separately, cloned, and linked together to form more complex queries.
A new feature indexing scheme for binary images is proposed. Using the structures of the conjugate classification of the hexagonal grid, ten intrinsically geometric invariant clusters are identified to partition a binary image into ten feature cluster images. The numbers of feature points in the feature images are evaluated. Using the ten integers, a probability model is defined to generate quantitative measurements for feature indexing. This provides intrinsic feature indexing sets for rapid retrieval of images based on their content. Two vectors of twelve probability measurements are used to describe images of varying sizes; sample pictures and their feature indices are illustrated.
Searching in image databases using image content has made the transition from the laboratory to consumer software. Storm Software is a pioneer in bringing these techniques to shrink-wrapped software applications, and this presentation describes some of the methods we use in our products and some of the experiences we have had in bringing this new technology to consumers. We describe the scope of the problem we are trying to solve as well as some of the algorithms and interfaces we used. We also describe some of the rationales (based on theory as well as on user testing) we had for the various design decisions we made. Finally, we describe some of the challenges and opportunities we see ahead. Descriptions and screen shots of two software products implementing image searching (EasyPhoto and Apple PhotoFlash) are provided. Both products were developed by Storm Software.
Modern computer applications use enormous volumes of rich data like video, still images, and text, as well as more conventional numeric and character data. Managing huge volumes of such diverse data requires a database. Content queries, such as 'find me the color images with red components higher than this threshold,' require that the database system be able to apply the qualification directly. Relational database systems that store images as untyped binary large objects (BLOBS) cannot apply qualifications like this, because the database system does not understand the contents of the BLOB. Object-Relational Database Management Systems (ORDBMS), on the other hand, allow users to extend the set of types and functions known to the database system. Programmers can write code that is dynamically loaded into the database server, and that operates on complex data types such as images. Those functions can be used in standard SQL queries, and the database manager can use new types and function results in indices to support fast queries on complex data. In addition, the query optimizer can be told how expensive the new functions are, so that it chooses an optimal strategy for satisfying complicated queries with many different predicates in their qualifications.
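The extension mechanism can be illustrated on a small scale with SQLite's user-defined functions, a much simpler relative of a full ORDBMS. The `red_fraction` predicate, the raw RGB-triple pixel layout, and the table are invented for the example; the point is that a registered function lets SQL apply a content qualification directly to image data rather than treating it as an opaque BLOB.

```python
import sqlite3

def red_fraction(pixels):
    # pixels: bytes laid out as (r, g, b) triples. Returns the
    # fraction of pixels whose red component exceeds 200.
    reds = [pixels[i] for i in range(0, len(pixels), 3)]
    return sum(1 for r in reds if r > 200) / len(reds)

conn = sqlite3.connect(":memory:")
# Register the Python function so SQL queries can call it.
conn.create_function("red_fraction", 1, red_fraction)
conn.execute("CREATE TABLE images (name TEXT, data BLOB)")
conn.execute("INSERT INTO images VALUES (?, ?)",
             ("sunset", bytes([255, 0, 0, 250, 10, 10, 0, 0, 255])))
conn.execute("INSERT INTO images VALUES (?, ?)",
             ("ocean", bytes([0, 0, 255, 10, 10, 250, 0, 0, 200])))
rows = conn.execute(
    "SELECT name FROM images WHERE red_fraction(data) > 0.5").fetchall()
print(rows)
```

A full ORDBMS goes further than this sketch: it can build indices over function results and feed function cost estimates to the query optimizer, as the abstract describes.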
IBM's Ultimedia Manager is a software product for management and retrieval of image data. The product includes both traditional database search and content based search. Traditional database search allows images to be retrieved by text descriptors or business data such as price, date, and catalog number. Content based search allows retrieval by similarity to a specified color, texture, shape, position, or any combination of these. The two can be combined, as in 'retrieve all images with the text `beach' in their description, and sort them in order by how much blue they contain.' Functions are also available for fast browsing and for database navigation. The two main components of Ultimedia Manager are a database population tool to prepare images for query by identifying areas of interest and computing their features, and the query tool for doing retrievals. Application areas include stock photography, electronic libraries, retail, cataloging, and business graphics.
The usefulness of a collection of scanned graphical documents can be measured by the facilities available for their retrieval. We present an approach for indexing a collection of line drawings automatically. The indexing is based on the textual and graphical content of the drawings. This approach has been developed to facilitate `retrieval by example' in heterogeneous collections of graphical documents. No a priori knowledge about the application domain is assumed. Starting with a raster image, candidate character patterns and graphical primitives (i.e., line segments and arcs) are extracted. Candidate character patterns are classified by an OCR method and grouped into word hypotheses. Graphical features of various types are computed from groupings of graphical primitives (e.g., sequences of adjacent lines, pairs of parallel lines). Retrieval occurs with a weighted information retrieval system. Each document of the collection and each query are described with a set of indexing features with their corresponding weights. The weight of an indexing feature reflects the descriptive nature of the feature and is computed from the number of occurrences of the indexing feature in the document (feature frequency ff) and the number of documents containing the indexing feature (document frequency df).
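The weighting scheme described, combining feature frequency (ff) with document frequency (df), is the classic tf-idf form from text retrieval: a feature is descriptive when it occurs often in a drawing but in few drawings overall. A minimal sketch with invented counts:

```python
import math

def weight(ff, df, n_docs):
    # tf-idf style weight: feature frequency scaled by the log of
    # the inverse document frequency. Features appearing in every
    # document get weight 0, i.e., they carry no discrimination.
    return ff * math.log(n_docs / df)

# Invented counts for a collection of 100 drawings:
# "parallel-line pair" occurs 5 times here but in only 2 drawings;
# "line segment" occurs 40 times here but in all 100 drawings.
rare = weight(5, 2, 100)
common = weight(40, 100, 100)
print(round(rare, 2), round(common, 2))
```

Query-document similarity would then be an inner product of these weighted feature descriptions, as in standard weighted information retrieval.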
Current feature-based image databases can typically perform efficient and effective searches on scalar feature information. However, many important features, such as graphs, histograms, and probability density functions, have more complex structure. Mechanisms to manipulate complex feature data are not currently well understood and must be further developed. The work we discuss in this paper explores techniques for the exploitation of spectral distribution information in a feature-based image database. A six band image was segmented into regions and spectral information for each region was maintained. A similarity measure for the spectral information is proposed and experiments are conducted to test its effectiveness. The objective of our current work is to determine if these techniques are effective and efficient at managing this type of image feature data.
CANDID (comparison algorithm for navigating digital image databases) was developed to enable content-based retrieval of digital imagery from large databases using a query-by-example methodology. A user provides an example image to the system, and images in the database that are similar to that example are retrieved. The development of CANDID was inspired by the N-gram approach to document fingerprinting, where a `global signature' is computed for every document in a database and these signatures are compared to one another to determine the similarity between any two documents. CANDID computes a global signature for every image in a database, where the signature is derived from various image features such as localized texture, shape, or color information. A distance between probability density functions of feature vectors is then used to compare signatures. In this paper, we present CANDID and highlight two results from our current research: subtracting a `background' signature from every signature in a database in an attempt to improve system performance when using inner-product similarity measures, and visualizing the contribution of individual pixels in the matching process. These ideas are applicable to any histogram-based comparison technique.
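The background-subtraction idea can be sketched with toy histogram signatures. The three-bin signatures and the choice of the database mean as the background are illustrative assumptions; the sketch shows why removing the component common to all signatures sharpens an inner-product comparison.

```python
def normalize(hist):
    # Turn raw counts into a probability-density-like signature.
    s = sum(hist)
    return [h / s for h in hist]

def background(signatures):
    # Mean signature over the database: the common component that
    # carries no discriminating information.
    n = len(signatures)
    return [sum(sig[i] for sig in signatures) / n
            for i in range(len(signatures[0]))]

def similarity(a, b, bg):
    # Inner product after subtracting the background signature.
    a = [x - y for x, y in zip(a, bg)]
    b = [x - y for x, y in zip(b, bg)]
    return sum(x * y for x, y in zip(a, b))

# Toy 3-bin signatures: two similar images and one odd one out.
sigs = [normalize([8, 1, 1]), normalize([7, 2, 1]), normalize([1, 1, 8])]
bg = background(sigs)
print(similarity(sigs[0], sigs[1], bg) > similarity(sigs[0], sigs[2], bg))
```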
This research explores the interaction of linguistic and photographic information in an integrated text/image database. By utilizing linguistic descriptions of a picture (speech and text input) coordinated with pointing references to the picture, we extract information useful in two aspects: image interpretation and image retrieval. In the image interpretation phase, objects and regions mentioned in the text are identified; the annotated image is stored in a database for future use. We incorporate techniques from our previous research on photo understanding using accompanying text: a system, PICTION, which identifies human faces in a newspaper photograph based on the caption. In the image retrieval phase, images matching natural language queries are presented to a user in a ranked order. This phase combines the output of (1) the image interpretation/annotation phase, (2) statistical text retrieval methods, and (3) image retrieval methods (e.g., color indexing). The system allows both point and click querying on a given image as well as intelligent querying across the entire text/image database.
Many applications require similarity retrieval of an image from a large collection of images. In such cases, image indexing becomes important for efficient organization and retrieval of images. This paper addresses this issue in the context of a database of signature images and describes a system for similarity retrieval and recognition of signature images. The proposed system uses a set of geometric and topological features to map a signature image into two strings of finite symbols. A local associative indexing scheme is then used on the strings to organize and search the signature database. The advantage of the local associative indexing is that it is tolerant of missing features and allows queries even with partial signatures. The performance of the system has been tested on a database of 120 signatures, with promising results.
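Associative indexing that tolerates partial queries can be sketched with an n-gram index over the symbol strings. The symbol alphabet, signature names, and the use of bigrams are invented for illustration; they stand in for whatever local associations the actual system derives from its geometric and topological features.

```python
def ngrams(s, n=2):
    # The local n-character associations present in a symbol string.
    return {s[i:i + n] for i in range(len(s) - n + 1)}

def build_index(database, n=2):
    # Map each n-gram to the set of signatures containing it.
    index = {}
    for name, symbols in database:
        for g in ngrams(symbols, n):
            index.setdefault(g, set()).add(name)
    return index

def search(index, query, n=2):
    # Score candidates by shared n-grams, so a partial or slightly
    # corrupted query string still finds the right signature.
    scores = {}
    for g in ngrams(query, n):
        for name in index.get(g, ()):
            scores[name] = scores.get(name, 0) + 1
    return max(scores, key=scores.get) if scores else None

# Toy symbol strings for two signatures (invented).
db = [("alice", "LCLRTCL"), ("bob", "RTRRCLT")]
idx = build_index(db)
print(search(idx, "LCLRT"))
```

Because scoring counts shared local associations rather than requiring an exact match, a query built from only part of a signature still retrieves the best candidate.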
We present a prototype video database system designed to accept video sequences as well as still images. The system indexes these sequences based on scene changes, creates a primitive structure of these sequences, and searches this structure for queried objects using specific color features. A video sequence input to the database is first indexed into subsequences using a color histogram difference method. A hierarchical structure is created by thresholding the sequences at various levels of inter-frame difference. For every subsequence that is identified, the first frame in that subsequence, the representative frame, is entered into the database. The system then automatically generates a description for the frame in terms of its color histogram features. Subsequently, the video sequence may be searched for objects (specified as regions of other video sequence frames or still images) using color similarity matching.
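The histogram-difference indexing step can be sketched as follows. The three-bin histograms and the threshold are toy values; in the hierarchical structure described above, thresholding the same difference signal at several levels would give coarser or finer segmentations.

```python
def hist_diff(h1, h2):
    # Sum of absolute bin differences between two color histograms.
    return sum(abs(a - b) for a, b in zip(h1, h2))

def segment(frames, threshold):
    # Start a new subsequence whenever consecutive histograms differ
    # by more than the threshold; return the index of each
    # representative (first) frame.
    reps = [0]
    for i in range(1, len(frames)):
        if hist_diff(frames[i - 1], frames[i]) > threshold:
            reps.append(i)
    return reps

# Toy 3-bin histograms for five frames: a scene change occurs
# between frames 2 and 3 (values invented).
frames = [[10, 5, 1], [9, 6, 1], [10, 5, 2],
          [1, 2, 13], [2, 1, 13]]
print(segment(frames, threshold=6))
```

Only the representative frames would be entered into the database and described by their color histogram features for later similarity matching.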
This paper describes the VideoSTAR experimental database system that is being designed to support video applications in sharing and reusing video data and meta-data. VideoSTAR provides four different repositories: for media files, virtual documents, video structures, and video annotations/user indexes. It also provides a generic video data model relating data in the different repositories to each other, and it offers a powerful application interface. VideoSTAR concepts have been evaluated by developing a number of experimental video tools, such as a video player, a video annotator, a video authoring tool, a video structure and contents browser, and a video query tool.
For developing advanced query formulation methods for general multimedia data, we describe the issues related to video data. We distinguish between the requirements for image retrieval and video retrieval by identifying queryable attributes unique to video data, namely audio, temporal structure, motion, and events. Our approach is based on visual query methods to describe predicates interactively while providing feedback that is as similar as possible to the video data. An initial prototype of our visual query system for video data is presented.
A video database provides content based access to video. This is achieved by organizing or indexing video data based on some set of features. This paper defines the problem of video indexing based on video data models. The procedure required to index video data is outlined. The use of semi-automatic techniques to speed up the indexing process is explored. These techniques use image motion features to aid in the indexing process. The techniques developed have been applied to video data from a cable television feed.
Information retrieval in video document archives presents specific issues. One of them is that a video document contains a large amount of information and can be seen under different aspects. We propose an information retrieval process model based on the cooperation between different specialists: specialists in the application domain, specialists of the media, and specialists in information retrieval. Each specialist has its own point of view on documents, partial knowledge which can be exploited in the query interpretation and during the search, and a particular role to play in the different stages of the retrieval process. A faceted data model helps to refine document descriptions and search results. Each facet can be linked to one structure level of video documents. During the retrieval process, a flexible collaboration between several information retrieval experts is set up to deal with the different aspects of document and query descriptions and to improve retrieval performance. A prototype, using the MHEG standard, is being implemented to retrieve TV news sequences and to present search results in a hypermedia form.
As video information proliferates, managing video sources becomes increasingly important. Automatic video partitioning is a prerequisite for organizing and indexing video sources. Several methods have been introduced to tackle this problem, e.g., pairwise and histogram comparisons. Each has advantages, but all are slow because they entail inspection of entire images. Furthermore, none of these methods has been able to define the camera break and the gradual transition, which are basic concepts for partitioning. In this paper, we first attempt to define the camera break. Then, based on our definition and a probability analysis, we propose a new video partitioning algorithm, called NET Comparison (NC), which compares the pixels along predefined net lines. In this way, only part of each image is inspected during classification. We compare the effectiveness of our method with other algorithms such as pairwise, likelihood, and histogram comparisons, evaluating them on a large set of varied image sequences that include camera movements, zooming, moving objects, deformed objects, and video with degraded image quality. Both gray-level and HSV images were tested, and our method outperformed existing approaches in speed and accuracy. On average, our method processes images two to three times faster than the best existing approach.
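The core idea of comparing pixels only along predefined net lines can be sketched as follows. This is a minimal illustration: the number of net lines, the per-pixel threshold, and the change-ratio threshold below are assumptions for demonstration, not the parameters reported in the paper.

```python
import numpy as np

def net_lines(h, w, n=4):
    """Pixel coordinates on n horizontal and n vertical 'net' lines."""
    rows = np.linspace(0, h - 1, n + 2, dtype=int)[1:-1]
    cols = np.linspace(0, w - 1, n + 2, dtype=int)[1:-1]
    idx = [(r, c) for r in rows for c in range(w)]
    idx += [(r, c) for c in cols for r in range(h)]
    return idx

def is_camera_break(f1, f2, pix_thresh=20, ratio_thresh=0.5, n=4):
    """Declare a camera break when the fraction of changed pixels
    along the net lines exceeds a threshold; only the net pixels
    are inspected, never the whole image."""
    idx = net_lines(*f1.shape, n=n)
    changed = [abs(int(f1[r, c]) - int(f2[r, c])) > pix_thresh
               for r, c in idx]
    return sum(changed) / len(changed) > ratio_thresh
```

The speedup comes from inspecting only the net pixels (here 400 of 2400 for a 40x60 frame) rather than every pixel, as in pairwise or histogram comparison.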
One of the challenging problems in video databases is the organization of video information. Segmenting a video into a number of clips and characterizing each clip has been suggested as one mechanism for organizing video information. This approach requires a suitable method to automatically locate cut points in a video. One way of finding such cut points is to determine the boundaries between consecutive camera shots. In this paper, we address this as a statistical hypothesis testing problem and present three tests to determine cut locations. All three tests can be applied directly to compressed video. This avoids an unnecessary decompression-compression cycle, since it is common to store and transmit digital video in compressed form. As our experimental results indicate, the statistical approach permits accurate detection of scene changes induced through straight as well as optical cuts.
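The hypothesis-testing flavor of this approach can be conveyed with a toy sketch. The paper's tests operate directly on compressed-domain quantities; the robust outlier test below on a generic per-frame statistic is a hypothetical stand-in that only illustrates the idea of testing each inter-frame difference against a no-change hypothesis.

```python
import numpy as np

def detect_cuts(frame_stats, z_thresh=5.0):
    """Flag frame transitions whose inter-frame difference is an
    outlier under the no-change hypothesis, using a robust z-score
    (median / median absolute deviation)."""
    d = np.abs(np.diff(frame_stats))
    med = np.median(d)
    mad = np.median(np.abs(d - med)) + 1e-9  # guard against zero MAD
    return [i + 1 for i, v in enumerate(d) if (v - med) / mad > z_thresh]
```

A sudden jump in the statistic (a straight cut) stands far outside the distribution of ordinary frame-to-frame variation and is rejected by the test; gradual optical cuts would need a statistic accumulated over several frames.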
This paper presents a video indexing and representation tool for sequences which contain moving persons, using a model-based dynamic scene analysis. A scenario, describing the sequence in terms of basic events, is proposed. These events constitute a first level of annotation and are used to build a visual representation of the sequence called Object Based Video Icon. Experiments are carried out and a prototype system is described.
As a first step to creating a video database, a video sequence has to be segmented into several subsequences based on significant changes in the scene. This enables the media user to identify the whole or a part of the sequence and to retrieve scenes of interest from a large video database. Researchers in the past have used a histogram-based inter-frame difference approach to identify significant scene changes. To determine which is the best color coordinate system for video indexing, we have evaluated the histogram-based indexing method using different color coordinate systems -- RGB, HSV, YIQ, L*a*b*, L*u*v*, and Munsell -- and compared the results for accuracy of indexing with reference to subjective indexing. Since it is difficult to determine the exact threshold value to obtain reasonably good results, we also propose a video segmenting method called hierarchical histogram-based indexing that segments a video sequence into several levels of subsequences using different levels of threshold.
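A minimal single-channel sketch of hierarchical histogram-based indexing follows; the bin count and the threshold levels are illustrative assumptions, not the values evaluated in the paper.

```python
import numpy as np

def hist_diff(f1, f2, bins=16):
    """Normalized L1 difference between the histograms of two frames."""
    h1, _ = np.histogram(f1, bins=bins, range=(0, 256))
    h2, _ = np.histogram(f2, bins=bins, range=(0, 256))
    return np.abs(h1 - h2).sum() / f1.size

def hierarchical_segment(frames, thresholds=(0.8, 0.4, 0.2)):
    """Cut points at successively finer threshold levels: coarse
    thresholds yield major scene changes, finer ones subdivide them."""
    diffs = [hist_diff(a, b) for a, b in zip(frames, frames[1:])]
    return {t: [i + 1 for i, d in enumerate(diffs) if d > t]
            for t in thresholds}
```

Rather than committing to a single hard-to-choose threshold, the hierarchical version returns one segmentation per level, so a sharp scene change survives at every level while weaker changes appear only at the finer levels.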
Video editors are frequently required to access sections of a video sequence which contain a particular scene. This may be regarded as an image retrieval-by-content problem where the user wishes to select images from within a large database according to a measure of similarity to a target. We present an intelligent video editing system based on a neural network coding scheme. The transformation learnt by the neural network maps each image into a very compact index which supports rapid fuzzy matching of video images. The neural network is trained using a learning law which produces an information-preserving transform. Trained in this way, the network learns features which characterize the distribution of scenes within the video sequence. Each image frame in the sequence is coded with respect to these features. We show how the system performs on a typical sequence of newsreel footage and discuss the factors affecting the performance of both the training and the retrieval mechanism.
A prototype object-oriented multimedia database management system currently being developed is described. The system supports the storage and retrieval of images, video, audio and documents composed of these types. The major features of the system include: (1) content-based indexes for each of the data types, (2) an intuitive user-oriented query language based on these indexes, (3) manual, semi-automatic and automatic indexing modes, (4) object- based user data models incorporated in query processing, (5) image/audio/video processing incorporated in the system, (6) versioning of objects, (7) browsing and navigation facilities. The indexes are interval-based and describe spatio-temporal relations between pairs of objects in the respective media. The query processing mechanism is described, as is the object- oriented data modeling facility. The most innovative aspects of this work are the following: (1) extension of iconic indexing of images to the audio and video data types, (2) an embedding of content-based iconic indexing in a multimedia database management system with particular emphasis on user-oriented indexing and querying, (3) the use of an object-oriented data model to alleviate the aliasing problem in query formation, (4) versioning of images/audio/video to save storage space.
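The abstract does not detail the interval-based index, but the general flavor of describing relations between pairs of objects as interval relations can be sketched in the style of Allen's interval algebra, a standard formalism assumed here purely for illustration.

```python
def interval_relation(a, b):
    """Classify the relation between two intervals (start, end),
    e.g. temporal extents of objects in audio/video or spatial
    extents of objects projected onto one image axis."""
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return 'before'
    if e1 == s2:
        return 'meets'
    if s1 == s2 and e1 == e2:
        return 'equal'
    if s1 > s2 and e1 < e2:
        return 'during'
    if s1 == s2 and e1 < e2:
        return 'starts'
    if s1 > s2 and e1 == e2:
        return 'finishes'
    if s1 < s2 and s2 < e1 < e2:
        return 'overlaps'
    return 'other'
```

Storing one such symbol per object pair gives a compact, media-independent index: a query like "object A appears while object B is on screen" reduces to matching stored relation symbols.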
We work towards a content-based image retrieval system, where queries can be image-like objects. At entry time, each image is processed to yield a large number of indices into its windows. A window is a square in a fixed quad-tree decomposition of the image, and an index is a fixed-size vector, called a descriptor, similar to the periodograms used in spectral estimation. The fixed decomposition of images was prompted by the need for fast processing, but leads to windows that often straddle image regions with different textural contents, making indices less effective. In this paper, we investigate different definitions of spectral distance which we plan to use to classify windows according to their texture content.
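The fixed decomposition and a periodogram-like window descriptor can be sketched as follows; the window sizes and the descriptor dimension are illustrative choices, not the ones used by the authors.

```python
import numpy as np

def quadtree_windows(img, depth):
    """Square windows of a fixed quad-tree decomposition of a
    2^k x 2^k image: the whole image, then its quadrants, etc."""
    n = img.shape[0]
    wins = []
    for d in range(depth + 1):
        s = n // (2 ** d)
        for i in range(0, n, s):
            for j in range(0, n, s):
                wins.append(img[i:i + s, j:j + s])
    return wins

def descriptor(win, k=2):
    """Fixed-size spectral descriptor: mean power in k x k blocks of
    the window's periodogram (power spectrum)."""
    p = np.abs(np.fft.fft2(win)) ** 2
    s = p.shape[0] // k
    return np.array([p[a*s:(a+1)*s, b*s:(b+1)*s].mean()
                     for a in range(k) for b in range(k)])
```

Because the decomposition is fixed rather than content-driven, a window can straddle two textures, which is exactly the problem the spectral-distance classification in the paper is meant to address.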
Similarity-based retrieval of images is an important task in multimedia applications. A major class of user queries requires retrieving those images in the database that are spatially similar to the query image. To process these queries, a spatial similarity function is desired. A spatial similarity function assesses the degree to which the spatial relationships in a database image conform to those specified in the query image. In this paper, we formalize the notion of spatial similarity for 2D symbolic images and provide a framework for characterizing the robustness of spatial similarity algorithms with respect to their ability to deal with translation, scale, rotation (both perfect and multiple) variants as well as the variants obtained by an arbitrary composition of translation, scale, and rotation. This characterization in turn is useful for comparing various algorithms for spatial similarity systematically. As an example, a few spatial similarity algorithms are characterized and then experimentally contrasted using a testbed of images.
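One simple instance of a spatial similarity function (sign-based pairwise relations, a hypothetical choice rather than one of the algorithms characterized in the paper) illustrates the robustness framework: it is invariant to translation and scale but not to rotation.

```python
def _sign(v):
    return (v > 0) - (v < 0)

def relations(objs):
    """Pairwise spatial relations between named objects in a 2D
    symbolic image: the signs of the x and y displacements."""
    rel = {}
    names = sorted(objs)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            dx = objs[b][0] - objs[a][0]
            dy = objs[b][1] - objs[a][1]
            rel[(a, b)] = (_sign(dx), _sign(dy))
    return rel

def spatial_similarity(query, image):
    """Fraction of query object pairs whose spatial relation also
    holds in the database image."""
    q, m = relations(query), relations(image)
    common = [p for p in q if p in m]
    if not common:
        return 0.0
    return sum(q[p] == m[p] for p in common) / len(common)
```

Under the paper's framework, this function would be classified as handling translation and scale variants but failing on rotation variants, since a 180-degree rotation flips every displacement sign.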
Recently, several image indexing techniques have been reported in the literature. However, these techniques require a large amount of off-line processing and additional storage space, and may not be applicable to images stored in compressed form. In this paper, we propose two efficient techniques based on vector quantization (VQ) for image indexing. In VQ, the image to be compressed is decomposed into L-dimensional vectors. Each vector is mapped onto one of a finite set (codebook) of reproduction vectors (codewords). The labels of the codewords are used to represent the image. In the first technique, for each codeword in the codebook, a histogram is generated and stored along with the codeword. We note that the superposition of the histograms of the codewords, which are used to represent an image, is a close approximation of the histogram of the image. This histogram is used as an index to store and retrieve the image. In the second technique, the histogram of the labels of an image is used as an index to access the image. The proposed techniques provide fast access to the images in the database, have lower storage requirements and combine image compression with image indexing. Simulation results confirm the gains of the proposed techniques in comparison with other techniques reported in the literature.
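The key observation behind the first technique (that superposing the histograms of the codewords used to encode an image approximates the image's own histogram) can be sketched with a toy example; the 4-dimensional vectors, two-codeword codebook, and bin layout below are illustrative assumptions.

```python
import numpy as np

def encode(blocks, codebook):
    """VQ encoding: map each L-dimensional block to the label of its
    nearest codeword (squared Euclidean distance)."""
    d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return d.argmin(axis=1)

def index_histogram(labels, cw_hists):
    """Approximate the image histogram by superposing the histograms
    stored with the codewords used to represent the image."""
    return cw_hists[labels].sum(axis=0)
```

A small usage example: since each codeword is stored with the histogram of its own components, the index comes almost for free at encoding time, with no need to decompress the image.

```python
codebook = np.array([[0, 0, 0, 0], [10, 10, 10, 10]], float)
cw_hists = np.stack([np.histogram(c, bins=4, range=(0, 16))[0]
                     for c in codebook])
blocks = np.array([[1, 0, 0, 1], [9, 10, 11, 10]], float)
approx = index_histogram(encode(blocks, codebook), cw_hists)
exact = np.histogram(blocks, bins=4, range=(0, 16))[0]
```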
We describe two new color indexing techniques. The first one is a more robust version of the commonly used color histogram indexing. In the index we store the cumulative color histograms. The L1, L2, or L∞ distance between two cumulative color histograms can be used to define a similarity measure of these two color distributions. We show that this method produces slightly better results than color histogram methods, and it is significantly more robust with respect to the quantization parameter of the histograms. The second technique is an example of a new approach to color indexing. Instead of storing the complete color distributions, the index contains only their dominant features. We implement this approach by storing the first three moments of each color channel of an image in the index, i.e., for an HSV image we store only 9 floating point numbers per image. The similarity function which is used for the retrieval is a weighted sum of the absolute differences between corresponding moments. Our tests clearly demonstrate that a retrieval based on this technique produces better results and runs faster than the histogram-based methods.
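The moment index follows directly from the description: mean, standard deviation, and skewness per channel, 9 numbers for a 3-channel image. The uniform weights in the distance below are a hypothetical choice; the paper's weighted sum leaves the weights free.

```python
import numpy as np

def moment_index(img):
    """First three moments (mean, std, skewness) of each color
    channel: a 9-number index for a 3-channel image."""
    feats = []
    for ch in range(img.shape[-1]):
        x = img[..., ch].astype(float).ravel()
        mu = x.mean()
        sd = x.std()
        skew = ((x - mu) ** 3).mean() / (sd ** 3 + 1e-12)
        feats += [mu, sd, skew]
    return np.array(feats)

def moment_distance(a, b, w=None):
    """Weighted sum of absolute differences between corresponding
    moments (uniform weights by default)."""
    w = np.ones_like(a) if w is None else w
    return float((w * np.abs(a - b)).sum())
```

Comparing two 9-vectors is far cheaper than comparing two full histograms, which is where the reported speed advantage over histogram-based retrieval comes from.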
In this paper, we study the possibility of using a θ-index for image database retrieval. Image retrieval is related to a partition of an image space. If there exists a mapping from the image space to a feature space, the partition of the image space can be converted into a partition of the feature space. To retrieve an image, some features of the image must be identified. Feature identification can be considered as the optimal assignment of a set of indices to images. This is equivalent to an optimal partition of the image space into mutually exclusive regions, each corresponding to particular values of a set of indices. In our system, the feature extraction is implemented by a two-dimensional θ-transformation. The time complexity of a θ-transformation is very low. Therefore, for a certain class of images, the θ-transformation is a very efficient algorithm for image indexing.
In order to retrieve a set of intended images from a huge image archive, human beings think of specific contents with respect to the searched scene, like a countryside or a technical drawing. Therefore, in general it is harder to retrieve images using a syntactical feature-based language than a language which offers the selection of examples concerning color, texture, and contour in combination with natural-language concepts. This motivation leads to content-based image analysis and, further, to content-based storage and retrieval of images. Furthermore, it is unreasonable for any human being to produce the content descriptions for thousands of images manually. From this point of view, the project IRIS (image retrieval for information systems) combines well-known methods and techniques in computer vision and AI in a new way to generate content descriptions of images in textual form automatically. IRIS retrieves the images by means of text retrieval realized by the SearchManager/6000. The textual description is generated by four sub-steps: feature extraction of colors, textures, and contours, segmentation, and interpretation of part-whole relations. The system is implemented on an IBM RS/6000 under AIX and has already been tested with 350 images.
A successful visual database system must provide facilities to manage both image data and the products extracted from them. The extracted items usually consist of textual and numeric data from which multiple visualizations can be created. Such visualizations are difficult to automate because they are domain-specific and often require data from multiple sources. In the Database Environment for Vision Research (DEVR) we address these issues. DEVR is an entity- oriented, scientific, visual database system. In DEVR, entities are stored in hierarchical, relational data structures. The schema for each entity contains a name, a set of properties, a set of parts, a set of attributed relations among the parts and a set of graphic definitions which describe how to build instance-specific visualizations. Graphic definitions are composed of one or more graphical primitives. For each primitive, the user identifies required data sources by graphically selecting various properties or parts within the schema hierarchy. As instances are created, the graphic definitions are used to automatically generate visualizations, which can later be viewed via a graphical browser. In this paper, we describe the visualization subsystem of the DEVR system, including schema construction, graphical definition, and instance browsing.
Multimedia involves the use of multiple forms of communication media in an interactive and integrated manner. At present, textual data is the media predominantly used to provide the interactivity due to the ease with which discrete semantic elements are identified. It is common practice to follow links from words or phrases within text to associated information elsewhere in the database. To achieve a similar degree of functionality with visual information typically requires that each image (or video sequence) be processed by hand, indicating the objects and locations within the image -- a process that is excessively expensive and time-consuming for large databases. This paper describes the implementation of a simple object recognition system that allows the specification of 3D models that can then be used to recognize objects within any image, in an analogous fashion to words within text. This enables image data to become a truly active media, within a multimedia database. It provides a significantly enhanced level of functionality while keeping the authoring effort to a minimum. The basic algorithms are described and then an example application is outlined, along with feedback from users of the system.
Indexing or retrieving information from a database of images in response to a query is an important problem in the on-line maintenance of large volumes of documents depicting images, graphics and text. A key component of image indexing is the selection of image regions that are likely to contain the queried object. In this paper we propose attentional selection as a paradigm for selection during image indexing. Specifically, we present an implementation of a model of attentional selection to perform indexing in the domain of technical manual documents depicting line drawing images of physical equipment. The indexing system developed selects regions containing a 3D machine part in relevant pages of the manual in response to a query describing the part. Model-based object recognition then confirms the presence of the part at that location by solving for the pose of the queried object.
Data hiding is the process of embedding data into image and audio signals. The process is constrained by the quantity of data, the need for invariance of the data under conditions where the `host' signal is subject to distortions, e.g., compression, and the degree to which the data must be immune to interception, modification, or removal. We explore both traditional and novel techniques for addressing the data hiding process and evaluate these techniques in light of three applications: copyright protection, tamper-proofing, and augmentation data embedding.
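As a concrete example of the traditional end of this spectrum, least-significant-bit substitution embeds data at minimal perceptual cost but survives none of the distortions listed above. This is a generic sketch of the classic technique, not the paper's specific scheme.

```python
import numpy as np

def embed_lsb(pixels, bits):
    """Hide one payload bit in the least significant bit of each
    pixel of a flat uint8 signal; distortion is at most 1 level."""
    out = pixels.copy()
    payload = np.array(bits, dtype=np.uint8)
    out[:len(bits)] = (out[:len(bits)] & 0xFE) | payload
    return out

def extract_lsb(pixels, n):
    """Recover the first n hidden bits."""
    return [int(b) for b in pixels[:n] & 1]
```

Because the payload lives in the lowest bit plane, even mild lossy compression destroys it, which is why robust applications such as copyright marking need redundancy or transform-domain embedding instead.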
In this paper we present the application of a hypermedia system in a prototype that provides new kinds of nodes and links for integrating multiple multimedia databases. Usually, multimedia databases are independent of each other: each manages different contents and attributes of multimedia data under its own schema, and each retrieval application has its own functions for efficient retrieval. In such a situation, a user may wish to integrate the databases, for example to retrieve a part number from a machine-part database and then use that part number for ordering. Without totally reconstructing the databases, this has been a difficult task. To solve this problem, we propose hypermedia-based integration of multiple multimedia databases that creates links between them without changing the schemas and interfaces of the already developed databases. The hypermedia system developed at our laboratory supports integrating applications and sharing the integrated applications. With this system, a node represents what an application does, and a link relates what one application does to another. By linking such nodes, a series of retrievals over the multimedia databases can be performed as a single retrieval.