Owing to the advent of multimedia, simultaneous access to text, image, sound, graphics and video has now become possible. However, video on computers is a very recent capability. It has been achieved thanks to progress in hardware components, digitization equipment, storage media (CD-ROM) and, above all, image compression techniques (JPEG¹ for still pictures and MPEG² for motion pictures). The representational power of video makes it the favourite medium for many multimedia applications.

The various media are not used in the same way. A structured text (with a table of contents), a still image, graphics, or sound (with a representative graph) may be seen as a whole, which allows it to be handled directly (for editing purposes, for example). Furthermore, these media support information-based navigation (hypertext³ is a good example of this capability). Video, on the other hand, is basically a sequential medium. Operations such as fast-forward, rewind and pause allow only limited interactivity when consulting video. This lack of interactivity is due to the fact that video information, unlike structured text, is delivered "flat", with no indication of its structure.

To obtain the same capabilities as with a structured text, it is necessary to have an interpretation level above the raw video information. In the case of structured text, the interpretation level corresponds to paragraph and chapter titles and to the table of contents. Furthermore, text has a basic semantic entity, the word, which is used with the search function and the index to create a multiplicity of navigational possibilities through the information.

The goal of our work is to offer ways of accessing videos that are similar to those available for structured texts. This will make it possible to quickly become acquainted with the video content, to quickly access the part of the video of particular interest, and to have entry points into the video.

Our methodology comprises two stages.
In the first stage, we seek to extract as much information as possible from the video by using digital image processing techniques. This stage gives us a layer of interpretation of the raw video data. In the second stage, an operator builds a high-level video representation using the results of the first stage.

In the next section, we introduce the terms used throughout this paper. Related work is presented in the third section, and our own work is described in the fourth. The conclusion gives a picture of our present work and its potential developments.