A depth-fused 3-D (DFD) display, which is composed of two 2-D images displayed at different depths, is a recently proposed 3-D display that enables an observer to perceive an apparent 3-D image without any extra equipment. The original data for it are a 2-D image of the objects and a depth map image of the objects. The two 2-D images are formed by dividing the luminance of the original 2-D image between them according to the depth data of the objects at each pixel. This paper presents the effect of compressing the depth map image on a DFD image. We studied still pictures, using JPEG as the compression algorithm. After the depth map image was decoded, the two 2-D images were formed and the 3-D images were displayed. The main result obtained from subjective evaluations is that the compression noise in the decoded depth map appears as errors of position in depth on the DFD image; however, a higher compression rate is possible for the depth map image than for a conventional 2-D image. This result shows that it is advantageous to transmit or store the original data before forming the two 2-D images.
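The luminance division described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the linear weighting, the normalization of depth to the range 0.0 (front plane) to 1.0 (rear plane), and the function name `split_luminance` are all assumptions made for the example.

```python
def split_luminance(luminance, depth):
    """Divide each pixel's luminance between a front and a rear image.

    luminance: 2-D list of pixel luminance values.
    depth:     2-D list of the same shape; 0.0 means the front plane and
               1.0 means the rear plane (assumed normalization).
    Returns (front, rear). A linear split is assumed, so the two images
    always sum to the original luminance at every pixel.
    """
    front, rear = [], []
    for lum_row, depth_row in zip(luminance, depth):
        front_row, rear_row = [], []
        for lum, d in zip(lum_row, depth_row):
            d = min(max(d, 0.0), 1.0)  # clamp noisy depth values into range
            rear_row.append(lum * d)
            front_row.append(lum * (1.0 - d))
        front.append(front_row)
        rear.append(rear_row)
    return front, rear


# Hypothetical 1 x 3 image: a pixel on the front plane, one a quarter of
# the way back, and one on the rear plane.
front, rear = split_luminance([[100.0, 100.0, 100.0]], [[0.0, 0.25, 1.0]])
print(front, rear)
```

Under this assumed split, an error in a decoded depth value changes only the luminance ratio between the two planes, which is consistent with the abstract's observation that compression noise shows up as an error of position in depth.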
We are developing a multiple-angle 3D-video system that will allow an audience to enjoy a concert in real time at a distant location by simultaneously shooting multiple stereoscopic images from different angles and transmitting them through a high-speed network. At the receiving side, a decoder restores the original image signals from the received data, and different video images are shown on multiple stereoscopic displays. As a result, the audience can enjoy multiple images of a concert at the same time. The purpose of this system is to give a remote audience the opportunity to enjoy a live concert as though they were at the concert site. This paper examines the technical requirements for the camera arrangement and the transmission rate needed to transmit images of a musical performance with this system. We have built an experimental system with four stereo cameras, and have experimentally transmitted musical performance images to test the system performance. In this paper, we also propose a method for transmitting multiple camera images more efficiently.
This paper reports a study on a new, region-wide search and pursuit system for missing objects such as stolen cars and wandering people. Using image matching based on object properties such as color and shape, an intelligent camera searches for the object. The camera then transmits the properties to the next camera so that the object is pursued successively. The experimental results show that the system can identify 2 cars as search objects among 40 cars under changing environmental conditions. Based on these data, the proposed system accomplishes a fundamental step. Finally, subjects for further research were identified, such as accurate shape extraction processing, a camera structure for high-speed processing, and multimedia attributes such as sound.
Digital terrestrial broadcasting can offer many programs to TV viewers, as communication satellite broadcasting does, and it is expected to spread in Japan. Image retrieval is therefore necessary for viewers faced with so many programs. We propose a method for retrieving a particular scene from video. The retrieval method uses the motion vectors contained in an MPEG-2 stream. Specifically, the movement in the image is defined as a characteristic derived from the motion vector information, and the characteristic of the retrieval target is compared with that of the query scene. In this way, the target scene is retrieved. We examined the efficacy of this retrieval method using television images.
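One way to picture the comparison of motion characteristics is sketched below. The abstract does not specify the feature or the distance measure, so the per-frame average motion vector, the squared-distance score, and the names `frame_feature`, `scene_distance`, and `find_scene` are assumptions chosen for illustration. The motion vectors are taken as already extracted from the stream.

```python
def frame_feature(vectors):
    """Average motion vector of one frame; (0.0, 0.0) if the frame has none."""
    if not vectors:
        return (0.0, 0.0)
    n = len(vectors)
    return (sum(v[0] for v in vectors) / n, sum(v[1] for v in vectors) / n)


def scene_distance(a, b):
    """Sum of squared differences between two equal-length feature sequences."""
    return sum((ax - bx) ** 2 + (ay - by) ** 2
               for (ax, ay), (bx, by) in zip(a, b))


def find_scene(video_frames, query_frames):
    """Return the start index of the window in the video whose motion
    features are closest to the query scene, or None if no window fits."""
    video = [frame_feature(f) for f in video_frames]
    query = [frame_feature(f) for f in query_frames]
    if not query or len(query) > len(video):
        return None
    best_start, best_score = None, float("inf")
    for start in range(len(video) - len(query) + 1):
        score = scene_distance(video[start:start + len(query)], query)
        if score < best_score:
            best_start, best_score = start, score
    return best_start


# Hypothetical data: each frame is a list of (dx, dy) motion vectors.
video = [[(0, 0)], [(1, 0)], [(4, 0), (6, 0)], [(0, 3)]]
query = [[(5, 0)], [(0, 3)]]
print(find_scene(video, query))
```

Averaging discards the spatial layout of the vectors. A histogram of vector directions per frame would be a natural alternative feature under the same sliding-window comparison.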
To improve speech recognition efficiency under background noise, such as outdoors, we propose a new recognition technology that incorporates three-dimensional lip movements. First, three-dimensional movements at four positions of the mouth were measured and analyzed using principal component analysis to clarify which positions and directions are the main contributors to pronunciation. Second, recognition evaluation tests for 50 Japanese words were carried out under noise levels ranging from 40 to 80 dB. In the experiment, a recognition efficiency of over 80% was measured at 70 dB, an improvement of 40% compared with ordinary speech recognition. The experimental results indicate that the proposed method can be developed into a practical speech recognition technology. Finally, subjects for further research were identified, such as improving the precision of lip movement measurement and conducting experiments and data collection outdoors.
Recently, various kinds of multimedia application systems have been actively developed, building on advances in high-speed communication networks, computer processing technologies, and digital content-handling technologies. Against this background, this paper proposes a new distributed multimedia database system that can effectively perform cooperative retrieval among distributed databases. The proposed system introduces a new concept, the 'retrieval manager', which functions as an intelligent controller so that the user can treat a set of distributed databases as one logical database. The logical database dynamically generates and applies a preferred combination of retrieval parameters on the basis of both directory data and the system environment. Moreover, the concept of a 'domain' is defined in the system as a managing unit of retrieval. Retrieval can be performed effectively through cooperative processing among multiple domains. A communication language and protocols are also defined and are used for every communication in the system. A language interpreter in each machine translates the communication language into the internal language used in that machine. Because of the language interpreter, internal modules such as the DBMS and the user interface modules can be freely selected. The concept of a 'content-set' is also introduced. A content-set is defined as a package of contents that are related to each other, and the system handles a content-set as one object. The user terminal can effectively control the display of retrieved contents by referring to data that indicate the relations among the contents in the content-set. To verify the function of the proposed system, a networked electronic museum was built experimentally. The results of this experiment indicate that the proposed system can effectively retrieve the target contents under the control of a number of distributed domains. The results also indicate that the system can work effectively even if it becomes large.
To assist people who have a tendency to wander, a human tracking system using wireless multimedia technology was studied. The system consists of a compact mobile unit that can be carried in a pocket and a base unit at home, connected through a public mobile network. Positioning is performed using GPS. This paper first describes the structure of the tracking system and its operating principle. Fundamental experiments, such as position measurement and position data transmission, indicate that the system can be developed into a practical human tracking system. Moreover, several subjects, such as the compactness of the mobile unit, improvement of position precision, and high reliability of human tracking, continue to be studied.
This paper proposes a new approach to automatically and accurately extracting an object from video streams. The approach provides a useful tool for creating linking information for a distributed movie-based Web-browsing system, and consists of a skip-labeling algorithm for feature-based segmentation and a shrink-merge tracking algorithm for tracking an object. The skip-labeling algorithm segments an image into integrated regions that share the same feature. The segmented regions belong to texture areas such as waves or forest. The shrink-merge tracking algorithm exploits the temporal continuity of moving objects and uses morphological image processing such as dilation and erosion. Dilation and erosion are executed repeatedly in a projection process in which the object area in the next frame is derived from the object area in the current frame. The shrink-merge tracking algorithm can also project the area of a rotating object in the current frame onto the rotating object in the next frame, which contains newly appearing regions. The automated object extraction method works satisfactorily for objects that move non-linearly within video streams, including MPEG and Motion JPEG, and works satisfactorily over approximately 450 frames, each with a full frame size of 704 × 480 pixels, at a video frame rate of 30 fps. Finally, this paper demonstrates that object-based linking information for a movie-based Web-browsing system can contain information on objects obtained by fully automated extraction from video streams.
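The morphological operations named in this abstract can be illustrated with a small sketch. It shows plain binary dilation and erosion with an assumed 3 × 3 structuring element, plus a simplified projection step that grows the current object area and keeps only pixels that are foreground in the next frame. It is not the shrink-merge algorithm itself, whose details the abstract does not give; the function names and the set-of-coordinates mask representation are assumptions.

```python
# Offsets of an assumed 3 x 3 structuring element (including the center).
NEIGHBORS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def dilate(mask):
    """Binary dilation: every pixel within the 3 x 3 neighborhood of a
    mask pixel joins the mask. mask is a set of (row, col) pairs."""
    return {(r + dy, c + dx) for (r, c) in mask for dy, dx in NEIGHBORS}


def erode(mask):
    """Binary erosion: keep a pixel only if its whole 3 x 3 neighborhood
    lies inside the mask."""
    return {(r, c) for (r, c) in mask
            if all((r + dy, c + dx) in mask for dy, dx in NEIGHBORS)}


def project(current_mask, next_foreground, steps=1):
    """Simplified projection: repeatedly grow the object area from the
    current frame, restricted to foreground pixels of the next frame."""
    grown = set(current_mask)
    for _ in range(steps):
        grown = dilate(grown) & next_foreground
    return grown


# Hypothetical example: the object shifts one pixel to the right.
current = {(0, 0), (0, 1)}
next_fg = {(0, 1), (0, 2)}
print(sorted(project(current, next_fg)))
```

Because the growth is limited by `steps`, the projection only follows an object that moves a few pixels between frames, which is one way to read the abstract's reliance on the time continuity of moving objects.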
This paper proposes a new distributed multimedia database system in which databases storing MPEG-2 videos and/or super-high-definition images are connected through B-ISDN, and describes an example of networking museums on the basis of the proposed database system. The proposed system introduces a new concept, the 'retrieval manager', which functions as an intelligent controller so that the user can treat a set of image databases as one logical database. A user terminal issues a content retrieval request to the retrieval manager located nearest to it on the network. The retrieved contents are then sent directly through the B-ISDN to the user terminal from the server that stores the designated contents. In this process, the designated logical database dynamically generates the best combination of retrieval parameters, such as the data transfer path, by referring to directory data and the environment of the system. The generated retrieval parameters are then used to select the most suitable data transfer path on the network. The resulting combination of parameters is therefore well suited to the distributed multimedia database system.
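The selection step described in this abstract might look roughly like the sketch below. The abstract does not define the directory format, the environment data, or the selection criterion, so everything here is hypothetical: the directory is modeled as a mapping from content identifier to the servers that hold it, the environment as the bandwidth currently available from each server to the user terminal, and the criterion as "highest available bandwidth".

```python
def choose_server(content_id, directory, bandwidth_mbps):
    """Pick the server from which the content should be sent.

    directory:      content_id -> list of server names holding the content
                    (hypothetical stand-in for the directory data).
    bandwidth_mbps: server name -> bandwidth currently available to the
                    user terminal (hypothetical stand-in for the
                    environment of the system).
    Returns the reachable server with the highest bandwidth, or None.
    """
    candidates = [s for s in directory.get(content_id, [])
                  if s in bandwidth_mbps]
    if not candidates:
        return None
    return max(candidates, key=lambda s: bandwidth_mbps[s])


# Hypothetical museum network: the same image is stored at two sites.
directory = {"vase-image": ["tokyo", "osaka"]}
bandwidth = {"tokyo": 20, "osaka": 150}
print(choose_server("vase-image", directory, bandwidth))
```

Since the choice is recomputed from the current bandwidth table on each request, the selected path changes as the network environment changes, which matches the dynamic generation of parameters that the abstract describes.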
This paper presents the significance of a multimedia medical consulting system, together with recent related developments and studies, and an experiment on medical consultation via a teleconference system. The results revealed the requirements and subjects for further development.
This paper reports a new contact-type, full-color reading system for compact color facsimiles. The system mainly consists of a direct-contact-type image sensor and bright 3-color LED arrays. Because the sensor does not require an optical lens system such as a rod lens, no color aberration occurs, which makes the system suitable for color reading. Recent advances in the brightness of blue LEDs have made a 3-color solid-state document illuminator possible. An LED illuminator is superior to a fluorescent lamp in uniformity of illumination, compactness, and longevity. Two aspects of the system structure were investigated. The first is an optical system design based on light transfer performance. The second is a 3-color illuminating method that can reproduce color accurately during high-speed scanning. The main experimental results are: (1) the reading time per line is 15 msec, which is acceptable for a G3-class color facsimile; (2) the resolution is 0.5 at a spatial frequency of 8 lines/mm for an RGB color bar chart; and (3) the color difference is 12, which is slightly higher than that of ordinary systems but can easily be improved through signal processing techniques. Therefore, the proposed reading system is applicable to compact color facsimiles.