KEYWORDS: Video, Video compression, Wavelet transforms, Image compression, Wavelets, Eye, Visualization, RGB color model, Video processing, Video coding
We present a novel approach to the compression of video frames based on the foveation behavior of the human visual system (HVS). Eye fixations on a video frame, as recorded in eye-gaze trace data, define an imaginary region of interest. The resolution of the frame as perceived by the human eye is determined by this eye-gaze (fixation) point, and it decreases dramatically with distance from the fovea. This behavior of the HVS has recently gained interest in image and video processing, especially for the compression of images and video frames. We present an approach in which eye-gaze trace data are integral to the compression process, and which has proved useful in yielding high compression performance. We partition a video frame into three regions: an innermost region, which includes the point of eye gaze and to which we apply lossless compression; an outer region, which encompasses the first and to which we apply visually lossless (near-lossless) compression; and an outermost region, to which we apply lossy compression. We use the Haar wavelet transform because of its low computational complexity. Preliminary results are promising and show improvement over other methods, which are mainly full-frame based.
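The three-region scheme can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the region shapes, the radii, the number of wavelet levels, or the quantizers, so the circular regions, the single-level Haar transform on 2x2 blocks, the quantization steps `(0, 2, 16)`, and the function names below are all assumptions made for illustration. Near-lossless and lossy coding are approximated here by a small and a large uniform quantization step on the detail coefficients.

```python
import math

def quantize(value, step):
    """Uniform quantizer; a step of 0 keeps the coefficient exactly."""
    return value if step == 0 else round(value / step) * step

def step_for_block(bi, bj, gaze, r_inner, r_outer, steps):
    """Choose a quantization step from the block centre's distance to the gaze point."""
    # Block (bi, bj) covers pixel rows 2*bi..2*bi+1 and columns 2*bj..2*bj+1.
    dist = math.hypot(2 * bi + 0.5 - gaze[0], 2 * bj + 0.5 - gaze[1])
    if dist <= r_inner:
        return steps[0]   # innermost region: lossless
    if dist <= r_outer:
        return steps[1]   # outer region: near-lossless (assumed small step)
    return steps[2]       # outermost region: lossy (assumed large step)

def foveated_haar(frame, gaze, r_inner, r_outer, steps=(0, 2, 16)):
    """One-level 2D Haar on 2x2 blocks, region-dependent quantization, inverse.

    frame: list of rows of integer pixel values, with even height and width.
    gaze:  (row, col) of the fixation point. Returns the reconstructed frame.
    """
    rows, cols = len(frame), len(frame[0])
    out = [[0.0] * cols for _ in range(rows)]
    for bi in range(rows // 2):
        for bj in range(cols // 2):
            y, x = 2 * bi, 2 * bj
            a, b = frame[y][x], frame[y][x + 1]
            c, d = frame[y + 1][x], frame[y + 1][x + 1]
            # Forward Haar: one average and three detail coefficients.
            ll = (a + b + c + d) / 4
            lh = (a - b + c - d) / 4
            hl = (a + b - c - d) / 4
            hh = (a - b - c + d) / 4
            # Only the details are quantized; the average is kept as is.
            s = step_for_block(bi, bj, gaze, r_inner, r_outer, steps)
            lh, hl, hh = (quantize(v, s) for v in (lh, hl, hh))
            # Inverse Haar.
            out[y][x] = ll + lh + hl + hh
            out[y][x + 1] = ll - lh + hl - hh
            out[y + 1][x] = ll + lh - hl - hh
            out[y + 1][x + 1] = ll - lh - hl + hh
    return out
```

For example, `foveated_haar(frame, gaze=(8, 8), r_inner=3, r_outer=6)` on a 16x16 frame reproduces the pixels around the gaze point exactly, while blocks far from it lose detail. A real codec would follow the quantizer with an entropy coder, which this sketch omits.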
We investigate the visual and vocal modalities of interaction with computer systems. We focus on integrating visual and vocal interfaces as replacement or additional modalities to enhance human-computer interaction. We present a new framework for employing eye gaze as an interface modality. While voice commands have been used as a means of interaction with computers for a number of years, integrating the vocal interface with a visual interface that detects the user's eye movements through an eye-tracking device is novel. It promises to open the way to new applications in which a hand-mouse interface provides little or no apparent support for the task to be accomplished. We present an array of applications to illustrate the new framework and eye-voice integration.
Performance results using wavelet and other multiresolution transforms are described. The roles of information content, resolution scale, and image capture noise are discussed. Delivery systems for a range of large image repositories, from areas that include medicine, astronomy, and graphic arts, are described, both as standalone systems for use on portable platforms and as client-server systems for use on the web.