Search and retrieval in huge archives of multimedia data is a challenging task. A classification step is often used to
reduce the number of entries on which the subsequent search is performed. In particular, when new entries are
continuously added to the database, a fast classification based on simple threshold evaluation is desirable.
In this work we present a CART-based (Classification And Regression Tree) classification framework for audio
streams belonging to multimedia databases. The database considered is the Archive of Ethnography and Social History
(AESS), which is mainly composed of popular songs and other audio records describing popular traditions
handed down from generation to generation, such as traditional fairs and customs.
The peculiarities of this database are that it is continuously updated; the audio recordings are acquired in unconstrained
environments; and it is difficult for non-expert human users to create the ground-truth labels.
In our experiments, half of all the available audio files have been randomly extracted and used as the training set. The
remaining ones have been used as the test set. The classifier has been trained to distinguish among three different classes:
speech, music, and song. All the audio files in the dataset have previously been manually labeled into the three classes
defined above by domain experts.
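As an illustration, the kind of tree classifier described above might be sketched as follows. The feature choice, the synthetic data, and the use of scikit-learn's DecisionTreeClassifier (an implementation of the CART algorithm, which classifies via simple threshold tests on features) are assumptions made for the example, not details taken from the work.

```python
# Hypothetical sketch: a CART classifier distinguishing speech/music/song.
# The two per-file features stand in for real audio descriptors
# (e.g. zero-crossing rate, spectral centroid); the data is synthetic.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Three synthetic clusters, one per class.
X = np.vstack([
    rng.normal(loc=[0.0, 0.0], size=(60, 2)),   # "speech"
    rng.normal(loc=[4.0, 0.0], size=(60, 2)),   # "music"
    rng.normal(loc=[0.0, 4.0], size=(60, 2)),   # "song"
])
y = np.array(["speech"] * 60 + ["music"] * 60 + ["song"] * 60)

# Random half/half train-test split, as in the experiments described.
idx = rng.permutation(len(X))
train, test = idx[: len(X) // 2], idx[len(X) // 2:]

clf = DecisionTreeClassifier(random_state=0)  # CART: axis-aligned threshold splits
clf.fit(X[train], y[train])
acc = clf.score(X[test], y[test])
print(f"test accuracy: {acc:.2f}")
```

Because CART decisions reduce to a cascade of threshold comparisons, classifying a new entry is fast, which matches the requirement of a continuously updated archive.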
The availability of large audio collections calls for ways to efficiently access and explore them by providing an effective overview of their contents at the interface level. In this paper we present an innovative strategy exploiting color to visualize the content of a database of audio records, part of a website dedicated to ethnographic information in a region of Italy.
In this paper we present AVIR (Audio & Video Information Retrieval), a project of CNR (Italian National Research Council) - ITC to develop tools supporting an information system for distance e-learning. AVIR has been designed to store, index, and classify audio and video lessons to make them available to students and other interested users. The core of AVIR is an SDR (Spoken Document Retrieval) system which automatically transcribes the spoken documents into texts and indexes them through appropriately created dictionaries. During online use, users can formulate queries searching documents by date, professor, or lesson title, or by selecting one or more specific words. The results are presented to the users: in the case of video lessons, a preview of the first frames is shown. Moreover, slides of the lessons and associated papers can be retrieved.
In this paper we present the SIRBeC web site, designed and implemented by CNR - ITC for the Cultural Department of Lombardy in northern Italy. The site allows the consultation, through texts and images, of the cultural heritage of the region. The main characteristics of the SIRBeC system are shown, with particular attention to the procedure that integrates geographic interrogation of georeferenced data with a standard textual query environment.
In this paper we present an extension to Quicklink that allows the user to browse database contents according to criteria regarding the semantics of the objects represented in the images. The Quicklink system retrieves images similar to a query image from large archives of artworks by dynamically matching their accompanying textual descriptions, and presents the results in HTML pages, where the retrieved images are ordered according to their degree of similarity. It is designed to adapt its behavior to user requests through a relevance feedback mechanism. The core of the extension designed and implemented in Quicklink is the Visual Dictionary, described here.
The need to retrieve visual information from large image and video collections is shared by many application domains. This paper describes the main features of Quicklook, a system that combines in a single framework the alphanumeric relational query, the content-based image query exploiting automatically computed low-level image features, and the textual similarity query exploiting any text attached to image database items.
We present here Quicklink, a system that retrieves images similar to a query image in large web archives of artworks by dynamically matching their textual descriptions (usually catalog cards), adapts its behavior to user requests, and presents the retrieval results in HTML pages, where the images are ordered according to their degree of similarity.
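Under the assumption that the description matching relies on a standard bag-of-words similarity (the abstract does not specify the actual technique Quicklink uses), ranking catalog cards against a query could be sketched as follows; the example cards and query are invented for illustration.

```python
# Hypothetical sketch: ranking artwork catalog cards by textual
# similarity to a query, using TF-IDF weighting and cosine similarity.
# This is a standard technique; Quicklink's matching may differ.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

cards = [
    "oil on canvas, portrait of a nobleman, 17th century",
    "fresco fragment, religious scene, saints and angels",
    "oil on canvas, landscape with river and trees",
]
query = "oil painting on canvas, portrait"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(cards)     # one row per catalog card
query_vector = vectorizer.transform([query])

scores = cosine_similarity(query_vector, doc_vectors).ravel()
ranking = scores.argsort()[::-1]  # indices ordered by decreasing similarity
for i in ranking:
    print(f"{scores[i]:.3f}  {cards[i]}")
```

The sorted scores correspond to the HTML result pages described above, where retrieved images appear in order of decreasing similarity to the query image's description.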
Archives of optical documents are increasingly widely employed, a demand driven also by new norms sanctioning the legal value of digital documents, provided they are stored on physically unalterable supports. On the supply side there is now a vast and technologically advanced market, where optical memories have solved the problem of the duration and permanence of data at costs comparable to those of magnetic memories. The remaining bottleneck in these systems is indexing. The indexing of documents with a variable structure, while still not completely automated, can be machine-supported to a large degree, with evident advantages both in the organization of the work and in the extraction of information, providing data that is much more detailed and potentially significant for the user. We present here a system for the automatic registration of correspondence to and from a public office. The system is based on a general methodology for the extraction, indexing, archiving, and retrieval of significant information from semi-structured documents. In our prototype application, this information is distributed among the database fields of sender, addressee, subject, date, and body of the document.
There is a great demand for efficient tools that can, on the basis of pictorial content, organize large quantities of images and rapidly retrieve those of interest. With that goal in mind, we present a method for indexing complex color images. The basic idea is to exploit image data decomposition and compression based on the standard Haar multiresolution wavelet transform to describe image content. In this way we are able to effectively eliminate data redundancy and concisely represent the salient features of the image in signatures of predefined length. In the retrieval phase, image signatures are compared using a similarity measure that the system has 'learned' from users. Experimental results confirm the feasibility of our approach, which outperforms more standard procedures in retrieval accuracy, at lower computational cost.
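A minimal sketch of such a wavelet-based signature might look like the following. The truncation to the largest-magnitude coefficients and the Euclidean comparison are illustrative assumptions; they stand in for the predefined-length signatures and the learned similarity measure described above.

```python
# Hypothetical sketch: fixed-length image signatures built from a
# standard 2D Haar multiresolution transform, compared by distance.
import numpy as np

def haar_2d(img, levels=3):
    """Orthonormal 2D Haar transform, applied level by level in place."""
    out = img.astype(float).copy()
    n = img.shape[0]
    for _ in range(levels):
        half = n // 2
        # Transform rows: averages to the left half, details to the right.
        a = (out[:n, :n:2] + out[:n, 1:n:2]) / np.sqrt(2)
        d = (out[:n, :n:2] - out[:n, 1:n:2]) / np.sqrt(2)
        out[:n, :half], out[:n, half:n] = a, d
        # Transform columns: averages to the top half, details below.
        a = (out[:n:2, :n] + out[1:n:2, :n]) / np.sqrt(2)
        d = (out[:n:2, :n] - out[1:n:2, :n]) / np.sqrt(2)
        out[:half, :n], out[half:n, :n] = a, d
        n = half
    return out

def signature(img, length=64):
    """Keep only the `length` largest-magnitude coefficients (redundancy removal)."""
    coeffs = haar_2d(img).ravel()
    sig = np.zeros_like(coeffs)
    top = np.argsort(np.abs(coeffs))[-length:]
    sig[top] = coeffs[top]
    return sig

rng = np.random.default_rng(1)
img_a = rng.random((64, 64))
img_b = img_a + 0.01 * rng.random((64, 64))   # near-duplicate of img_a
img_c = rng.random((64, 64))                   # unrelated image

d_ab = np.linalg.norm(signature(img_a) - signature(img_b))
d_ac = np.linalg.norm(signature(img_a) - signature(img_c))
print(d_ab < d_ac)  # the near-duplicate is closer than the unrelated image
```

Keeping only the largest wavelet coefficients discards redundant detail while preserving the salient structure, which is what allows a concise signature of predefined length to remain discriminative.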