The goal of this work is to explore management strategies and algorithms for large-scale multimedia conferencing over a communication network. Since the use of multimedia conferencing is still limited, the management of such systems has not yet been studied in depth. Well-organized and human-friendly multimedia conference management should use its limited resources efficiently and fairly while taking into account the requirements of the conference participants. The ability of the management to enforce fair policies and to quickly take into account the participants' preferences may even lead to a conference environment that is more pleasant and more effective than a similar face-to-face meeting. We suggest several principles for defining and solving resource sharing problems in this context. The conference resources addressed in this paper are bandwidth (conference network capacity), time (participants' scheduling), and limitations of audio and visual equipment. The participants' requirements for these resources are defined and translated into Quality of Service requirements and fairness criteria.
Most conferencing systems focus on facilitating one of two types of meetings: those in a single room, consisting entirely of collocated participants, or those with isolated individuals at different physical locations. Our experiences are of a third style: hybrid meetings consisting of both collocated groups and isolated participants. We illustrate the limitations of using existing desktop-based tools in the shared meeting room portion of this hybrid meeting style, and propose adding a software control substrate matched to the specifics of the application to address the inadequacies. We derive requirements for the in-room applications and, as a concrete example from the domain, describe the design and implementation of an application for manipulating an in-room shared video display. Our design employs a user interface split across multiple physical devices, paired with a control protocol managing communication between them. The client portion runs on wirelessly connected portable devices (laptops and 3Com Palm Pilots) and supports per-user input; the server portion handles presentation of shared output on a video monitor. Our design is optimized for meeting room use in three ways: simplified operation to reduce demands on attention, support for remote control, and support for access by multiple simultaneous users.
The role of traditional multimedia systems has been to disseminate information. The advent of media spaces, however, offers increased potential in terms of presentation and facilitates a natural and intuitive environment for interpersonal communication. Combined with Computer Supported Cooperative Work technologies, these shared, media-rich environments offer a natural basis for distributed collaboration through a seamless blend of presentational, conversational and interactive multimedia. The resulting notion underlies the definition of a collaborative media space in which users interact with each other through the experience and manipulation of shared media. The integration of such a diverse array of entities presents many challenges, ranging from the need to support a variety of media types to managing how objects in such a system interact. Indeed, a primary consideration with such a system is the coordination (including both causal and temporal synchronization) of entities within the space. This work addresses how to facilitate media space design by employing a pattern-based meta-level architecture and management infrastructure in which reflection is used to isolate system-level issues such as behavioral coordination from low-level, media-specific computation. The architectural framework and its underlying topology are illustrated along with the model's application to a distance education system.
The Internet Multicast Backbone (MBone) has seen tremendous growth during the past four years. For the past three and a half years, we have broadcast a weekly seminar on the MBone. The effort to set up and produce each broadcast has grown as we worked to improve the broadcasts and to support more services (e.g., on-demand playback, recording, and video effects). This problem motivated us to develop a system to automate the tasks required to produce a broadcast. The motivation, requirements, and implementation of the system are presented.
In this paper, we consider the problem of providing multimedia services to mobile clients, from the viewpoint of designing applications and systems that support session mobility. We present a software architecture for multimedia services that supports session hand-off between service providers, and allows operating parameters of all components of the session to be altered as part of the hand-off. We describe a user interface that simplifies initiating session mobility, while taking into consideration display device capability. We discuss the system support that is necessary, and describe the implementation of our ideas within our software architecture.
We present a new compression algorithm for synthetic images that produces high compression rates by utilizing depth and color information from previously rendered images. Images predicted from prior images are combined with a residual image, which may be transmitted from a remote location, to generate new images. The image-based rendering technique provides accurate motion prediction and accelerates rendering at the same time by exploiting temporal coherence. The motion prediction is computed and evaluated in image order, pixel by pixel, producing residual images that are sparse and do not require address or index data. In many cases, the system yields a compression ratio improvement of a factor of 4 to 10 over MPEG. This approach is attractive for remote rendering applications where a client system may be a relatively low-performance machine and limited network bandwidth makes transmission of large 3D data impractical. The efficiency of the server generally increases with scene complexity or data size since the rendering time is predominantly a function of image size. This technique is also applicable to archiving animation.
In this paper, we introduce an object-oriented image coding algorithm to differentiate regions of interest (ROI) in visual communications. Our scheme is motivated by the fact that in visual communications, image contents (objects) are not equally important. For a given network bandwidth budget, one should give the highest transmission priority to the most interesting object, and serve the remaining ones at lower priorities. We propose a DWT-based Multiresolution Markov Random Field technique to segment image objects according to their textures. We show that this technique can effectively distinguish visual objects and assign them different priorities. This scheme can be integrated with our ROI compression coder, the Generalized Self-Similarity Trees codec, for networking applications.
In this paper we present two approaches to combining recent results of cryptographic research with the requirements of modern multimedia systems. The first is to evaluate modern block ciphers in a Java environment. The second approach is based on recent developments regarding fast Luby-Rackoff ciphers. Paradoxically, it deals with doing 'high-bandwidth encryption with low-bandwidth smartcards'. We also discuss implementation considerations for a specific multimedia project, the multimedia database for teleteaching at the University of Mannheim.
We introduce a new scalable video encoding technique which is geared to support video communication over unreliable channels. Our video encoding technique is based on the octree representation of frames and vector quantization. The algorithm divides the video sequence into blocks and then cubes, which are encoded as a 3D entity. We then apply a mean-removal algorithm on each block separately, generate the octree for each block, and use vector quantization to compress the octree data. The resulting encoding is both highly scalable and robust. We make use of the temporal dependency between frames by using a 3D encoding. We show that this encoding is time-efficient to generate and gives a very good compression ratio. In addition, we demonstrate its ability to tolerate and conceal communication errors. Moreover, we illustrate the ability of our encoding technique to react to changing network traffic patterns. The new encoding technique, besides being simple and effective, proved to perform well with respect to both scalability and robustness.
An important feature to be considered in the design of multimedia DBMSs is content-based retrieval of images. Most work in this area has focused on feature-based retrieval; we focus on retrieval based on spatial relationships, which include directional and topological relationships. The most common data structure used for representing directional relations is the 2-D string. The search process, however, is sequential and the technique does not scale up for large databases. We propose a new indexing structure, the 2-D-S-tree, to organize 2-D strings for query efficiency. The 2-D-S-tree is completely dynamic; inserts and deletes can be intermixed with searches and no periodic reorganization is required. A performance analysis is conducted, and both analytical and experimental results indicate that the 2-D-S-tree is an efficient index structure for content-based retrieval of images.
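As an illustration of the 2-D string representation mentioned above, the following Python sketch builds one from a symbolic picture. It assumes the commonly described form of a 2-D string, in which object names are ordered along each axis and separated by `<` (strictly before) or `=` (same position). The function names and the sample picture are hypothetical, and the 2-D-S-tree index itself is not modeled.

```python
def axis_string(objects, axis):
    """Order object names along one axis (0 = x, 1 = y).

    Names are joined by '<' when the coordinate increases and by '='
    when two objects share the same coordinate (assumed operator set).
    """
    ordered = sorted(objects.items(), key=lambda item: (item[1][axis], item[0]))
    parts = []
    previous = None
    for name, position in ordered:
        if previous is not None:
            parts.append("<" if position[axis] > previous else "=")
        parts.append(name)
        previous = position[axis]
    return " ".join(parts)


def two_d_string(objects):
    """Return the (u, v) pair: the orderings along x and along y."""
    return axis_string(objects, 0), axis_string(objects, 1)


# Hypothetical symbolic picture: object name -> (x, y) grid position.
picture = {"a": (0, 1), "b": (1, 0), "c": (1, 2)}
u, v = two_d_string(picture)
```

Matching a query against many such strings one by one is the sequential search that the abstract identifies as the scaling problem.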
In widely available multimedia digital libraries, querying, browsing, and displaying data pose new challenges given the diversity of the applications, of the users, and of the data. We address some of these challenges by tightly integrating querying with user-defined data presentation and by supporting browsing within query-defined groupings of the multimedia objects. Groupings use screen real estate efficiently and enhance the comprehension of the data. The user interface supports virtual document templates for specifying layout, with associated visual query boxes for specifying document content. By nesting query boxes, the user is able to define how the browsing is going to be performed. The visual nesting 'drives' the actual querying process and therefore goes beyond the simple specification of the presentation. Examples are provided of the system as used to query the Perseus digital library of classical artifacts.
We examine the use of the anycasting communication paradigm to improve client performance when accessing replicated multimedia objects. Anycasting supports dynamic selection of a server from a group of servers that provide equivalent content. If the selection is done well, the client will experience improved performance. A key issue in anycasting is the method used to maintain the performance information used in server selection. We explore using past performance or experience to predict future performance. We conduct our work in the context of a customized web prefetching application called WebSnatcher. We examine a variety of algorithms for selecting a server using past performance and find that the overall average and weighted average algorithms are closest to optimal performance. In addition to the WebSnatcher application, this work has implications for responsible network behavior by other applications that generate network traffic automatically. By using the techniques we present here, such applications can reduce network and server load, potentially improving performance for interactive applications. The results can also be used to reach conclusions about the performance that would be obtained if anycasting were used in an interactive application.
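The idea of selecting a server from past performance can be sketched in Python as follows. This is a minimal illustration, not the WebSnatcher implementation: the class name, the use of throughput as the performance metric, the exponential form of the weighted average, and the rule of trying unmeasured servers first are all assumptions made for the example.

```python
class ServerHistory:
    """Keeps per-server performance history (hypothetical helper)."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha      # assumed weight given to the newest sample
        self.count = {}         # number of samples seen per server
        self.mean = {}          # overall average throughput per server
        self.weighted = {}      # recency-weighted average per server

    def record(self, server, throughput):
        n = self.count.get(server, 0) + 1
        self.count[server] = n
        previous = self.mean.get(server, 0.0)
        self.mean[server] = previous + (throughput - previous) / n
        if server in self.weighted:
            old = self.weighted[server]
            self.weighted[server] = self.alpha * throughput + (1 - self.alpha) * old
        else:
            self.weighted[server] = throughput

    def select(self, servers, policy="overall"):
        table = self.mean if policy == "overall" else self.weighted
        # Assumption: probe servers with no history before ranking the rest.
        for server in servers:
            if server not in table:
                return server
        return max(servers, key=lambda s: table[s])


history = ServerHistory(alpha=0.5)
for sample in (100, 100, 10):   # server "a" was fast, then slowed down
    history.record("a", sample)
for sample in (50, 50, 80):     # server "b" has recently improved
    history.record("b", sample)
```

With these samples the two policies disagree: the overall average still favors "a", while the recency-weighted average favors "b".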
In this paper, we investigate the effectiveness of packet-drop mechanisms in conjunction with fair queuing link scheduling and hierarchical link sharing. Under fair queuing, the link share of a flow changes dynamically due to the arrivals and departures of flows and their bursts. This phenomenon becomes more pronounced in the case of hierarchical link sharing. Packet-drop mechanisms play an integral role for bandwidth-adaptive flows, such as TCP, that are expected to adjust their rates to the flows' changing fair share of the link bandwidth. We show experimentally that, under the existing drop policies (including random early detection and per-flow schemes such as the longest-queue drop) implemented with fair scheduling policies, TCP flows are slow to adjust their rates to their changing share of the link bandwidth. To overcome this problem, we propose a new packet-drop policy that simultaneously exploits two dimensions: when to drop and what to drop. We demonstrate that, under our packet-drop policy and with fair queuing, TCP flows adapt to their changing fair share of the link bandwidth when competing with different types of cross-traffic (e.g., bandwidth adaptive, rate controlled, greedy and on-off). We also illustrate that our drop policy provides isolation and fairness to flows other than TCP (e.g., rate controlled, on-off).
The Internet research community is promoting active queue management in routers as a proactive means of addressing congestion in the Internet. Active queue management mechanisms such as Random Early Detection (RED) work well for TCP flows but can fail in the presence of unresponsive UDP flows. Recent proposals extend RED to strongly favor TCP and TCP-like flows and to actively penalize 'misbehaving' flows. This is problematic for multimedia flows that, although potentially well-behaved, do not, or cannot, satisfy the definition of a TCP-like flow. In this paper we investigate an extension to RED active queue management called Class-Based Thresholds (CBT). The goal of CBT is to reduce congestion in routers and to protect TCP from all UDP flows while also ensuring acceptable throughput and latency for well-behaved UDP flows. CBT attempts to realize a 'better than best effort' service for well-behaved multimedia flows that is comparable to that achieved by a packet or link scheduling discipline; however, CBT does this by queue management rather than by scheduling. We present results of experiments comparing our mechanisms to plain RED and to FRED, a variant of RED designed to ensure fair allocation of bandwidth amongst flows. We also compare CBT to a packet scheduling scheme. The experiments show that CBT (1) realizes protection for TCP, and (2) provides throughput and end-to-end latency for tagged UDP flows that are better than those under FRED and RED and comparable to those achieved by packet scheduling. Moreover, CBT is a lighter-weight mechanism than FRED in terms of its state requirements and implementation complexity.
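A rough Python sketch of threshold-based queue management in the spirit of CBT follows. It rests on assumptions not stated in the abstract: each non-TCP class has a limit on its average number of queued packets, that average is an exponentially weighted moving average, and TCP packets go to a RED stage that is not modeled here (they are simply accepted). The class names and parameters are hypothetical.

```python
from collections import deque


class ClassThresholdQueue:
    """Single FIFO queue with per-class occupancy thresholds (sketch)."""

    def __init__(self, thresholds, weight=0.25):
        self.thresholds = dict(thresholds)  # class -> allowed average occupancy
        self.weight = weight                # assumed averaging weight
        self.queued = {c: 0 for c in self.thresholds}
        self.average = {c: 0.0 for c in self.thresholds}
        self.fifo = deque()

    def enqueue(self, packet_class, payload):
        """Return True if the packet is queued, False if it is dropped."""
        if packet_class in self.thresholds:
            w = self.weight
            self.average[packet_class] = (
                (1 - w) * self.average[packet_class] + w * self.queued[packet_class]
            )
            if self.average[packet_class] > self.thresholds[packet_class]:
                return False  # class exceeds its share of the queue
            self.queued[packet_class] += 1
        # Classes without a threshold (e.g. TCP) would be handled by RED,
        # which this sketch does not model.
        self.fifo.append((packet_class, payload))
        return True

    def dequeue(self):
        if not self.fifo:
            return None
        packet_class, payload = self.fifo.popleft()
        if packet_class in self.queued:
            self.queued[packet_class] -= 1
        return packet_class, payload


# weight=1.0 makes the average equal the current per-class queue length.
queue = ClassThresholdQueue({"tagged_udp": 4, "other_udp": 1}, weight=1.0)
other_results = [queue.enqueue("other_udp", i) for i in range(4)]
tagged_results = [queue.enqueue("tagged_udp", i) for i in range(4)]
tcp_result = queue.enqueue("tcp", "segment")
```

Because all classes share one FIFO, packets leave in arrival order; the thresholds only limit how much of the queue each class may occupy, which is queue management rather than scheduling.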
In a video multicast session, the receivers may possess different capabilities or be connected to the network through a variety of different access speeds. Multicasting video traffic at any one rate to all of the receivers can be unfair, as some receivers may experience high losses while others find that their reception capacity is not fully utilized. Layered video multicast protocols have been developed to address this intra-session fairness problem. In such protocols (e.g., RLM and LVMR), video is multicast in multiple layers over separate multicast groups. Receivers join as many layers as they can handle. While protocols such as RLM and LVMR have been shown to successfully address the intra-session fairness issue in video multicast, they do not address the issue of inter-session fairness, i.e., fair sharing between multiple video sessions and TCP sessions. In this paper, we demonstrate and develop insight into the inter-session fairness problem through a set of simulation experiments. We then propose a novel idea to improve inter-session fairness for layered video multicast protocols, a layer-based congestion sensitivity mechanism, and finally compare it with some other end-to-end schemes designed to promote inter-session fairness when used to augment layered video multicast protocols.
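The statement that receivers join as many layers as they can handle can be illustrated with a short Python sketch. It takes a static view in which the receiver's capacity is known in advance; that is an assumption for illustration only, since actual protocols such as RLM and LVMR have to discover the sustainable level at run time. The function name and the layer rates are hypothetical.

```python
def layers_to_join(layer_rates_kbps, capacity_kbps):
    """Count how many cumulative layers fit within the receiver's capacity.

    Layers are cumulative: layer k is only useful if layers 0..k-1
    are also received, so the scan stops at the first layer that does not fit.
    """
    total = 0
    joined = 0
    for rate in layer_rates_kbps:
        if total + rate > capacity_kbps:
            break
        total += rate
        joined += 1
    return joined


# Hypothetical layer rates: a base layer followed by three enhancement layers.
rates = [32, 64, 128, 256]
```

For example, a receiver with 250 kbps of capacity joins three layers (224 kbps in total), while one with 100 kbps joins two.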
Thrifty scheduling is an algorithm that improves the responsiveness of a stripe-scheduled multimedia server. It increases the determinism of the data-distribution service, reduces the likelihood of high startup delays, and enables an increase in the rated load of the system. A stripe-scheduled media server is a distributed video-on-demand system that load-balances by striping video data across multiple computer nodes and cyclically scheduling the distribution of the data. The server displays highly variable startup delays in response to requests for data streams. These delays are due to clusters of allocated slots in the distribution schedule, which form naturally as the system load increases. Thrifty scheduling is a scalable algorithm that improves responsiveness by allocating streams to schedule slots in a way that reduces the clustering in the schedule. This algorithm has been incorporated into the Tiger video fileserver.
Patching has been shown to be cost-efficient for video-on-demand systems. Unlike conventional multicast, patching is a dynamic multicast scheme which enables a new request to join an ongoing multicast. Since a multicast can now grow dynamically to serve new users, this approach is more efficient than traditional multicast. In addition, since a new request can be serviced immediately without having to wait for the next multicast, true video-on-demand can be achieved. In this paper, we introduce the notion of a patching window, and present a generalized patching method. We show that existing schemes are special cases with a specific patching window size. We derive a mathematical formula to help determine the optimal size for the patching window. This formula allows us to design the best patching scheme for a given workload. The proposed technique is validated using simulations, which show that the analytical results are very accurate. We also provide performance results to demonstrate that the optimal technique outperforms the existing schemes by a significant margin. It is also up to two times better than the best Piggybacking method, which provides data sharing by merging the services in progress into a single stream by altering their display rates.
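The role of the patching window can be sketched in Python under a simple assumed model: a request arriving within the window of the latest regular multicast joins it and additionally receives a patch stream covering only the portion it missed; otherwise a new regular multicast is started. The function names, the cost measure (total minutes of video sent by the server), and the sample arrivals are illustrative assumptions, not the paper's formula.

```python
def schedule_requests(arrivals, window, video_length):
    """Decide, per request, between a regular multicast and a patch stream.

    Returns a list of (arrival_time, kind, stream_length) tuples.
    """
    plan = []
    last_regular = None  # start time of the most recent regular multicast
    for t in sorted(arrivals):
        if last_regular is not None and t - last_regular <= window:
            # Join the ongoing multicast; patch only the missed prefix.
            plan.append((t, "patch", t - last_regular))
        else:
            last_regular = t
            plan.append((t, "regular", video_length))
    return plan


def server_cost(plan):
    """Total minutes of video the server must transmit (assumed metric)."""
    return sum(length for _, _, length in plan)


# Hypothetical workload: arrival times in minutes for a 90-minute video.
arrivals = [0, 5, 12, 30]
narrow = schedule_requests(arrivals, window=10, video_length=90)
wide = schedule_requests(arrivals, window=40, video_length=90)
no_patching = schedule_requests(arrivals, window=0, video_length=90)
```

Under this model a window of zero degenerates to one full stream per request, and a larger window trades longer patch streams for fewer regular multicasts, which is why an optimal window size exists for a given workload.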
In this paper, we propose a novel mechanism for providing deterministic service for Variable Bit Rate (VBR) streams at the disk. Previous approaches have relied on the peak rate of the stream for admission control when providing deterministic service. The proposed scheme allows statistical multiplexing of VBR streams at the storage system while providing deterministic service guarantees. We show that the proposed scheme can significantly improve stream throughput compared to peak-rate-based schemes. We also evaluate the impact of other strategies, such as data smoothing, statistical guarantees and higher stream startup latencies, on the throughput of VBR streams. We show that stream startup latency can be effectively traded off for improved stream throughput. We also show that smoothing and statistical guarantees do not provide significant additional improvements in stream throughput beyond the proposed approach.
In this paper we describe collaborative work between two projects at the University of Reading. We define Quality of Perception (QoP) as representing the user side of the more technical and traditional Quality of Service (QoS). QoP is a term which encompasses not only a user's satisfaction with the quality of multimedia presentations, but also his/her ability to analyze, synthesize and assimilate the informational content of multimedia displays. The Dynamically Reconfigurable Protocol Stacks (DRoPS) project addresses issues of runtime reconfigurable transport systems. Network architectures that do not support resource reservation are unable to guarantee fundamental connection characteristics such as delay, jitter, throughput, loss and bit error rates. The DRoPS architecture supports low cost reconfiguration of individual protocol mechanisms in an attempt to best maintain QoP in connections where the provided QoS fluctuates unpredictably.
In this paper we present a method for sharing collaboration-unaware VRML content, e.g., 3D models which were not specifically developed for use in a distributed environment. This functionality is an essential requirement for the inclusion of arbitrary VRML content, as generated by standard CAD or animation software, into teleconferencing sessions. We have developed a 3D TeleCooperation (TeCo3D) prototype to demonstrate the feasibility of our approach. The basic services provided by the prototype are the distribution of cooperation-unaware VRML content, the sharing of user interactions, and the joint viewing of the content. In order to achieve maximum portability, the prototype was developed completely in Java. This paper presents general aspects of sharing VRML content as well as the concepts, the architecture and the services of the TeCo3D prototype. Our approach relies on existing VRML browsers as the VRML presentation and execution engines, while reliable multicast is used as the means of communication to provide for scalability.
Video effects play an important role in adding production value to video programs. The use of video effects with Internet Video sources, however, is still uncommon because traditional hardware-based solutions are poorly suited to the Internet environment. In previous work, we described a parallel, software-only video effects system designed for Internet Video and explored the use of temporal parallelism. This paper explores the use of spatial parallelism. In particular, an intermediate semicompressed video format is described that was designed to exploit spatial parallelism, and performance measurements are reported on the use of this representation.
This paper presents a new approach for constructing libraries for building processing-intensive multimedia software. Such software is currently constructed either by using high-level libraries or by writing it 'from scratch' using C. We have found that the first approach produces inefficient code, while the second approach is time-consuming and produces complex code that is difficult to maintain or reuse. We therefore designed and implemented Dali, a set of reusable, high-performance primitives and abstractions that are at an intermediate level of abstraction between C and conventional libraries. By decomposing common multimedia data types and operations into thin abstractions and primitives, programs written using Dali achieve performance competitive with hand-tuned C code, but are shorter and more reusable. Furthermore, Dali programs can employ optimizations that are difficult to exploit in C (because the code is so verbose) and impossible using conventional libraries (because the abstractions are too thick). We discuss the design of Dali, show several example programs written using Dali, and show that programs written in Dali achieve performance competitive with hand-tuned C programs.
Huge amounts of digital video information appear on the Web. Browsing becomes a predominant issue since (1) the result lists to be expected from video search engines on the Web are likely to be even less accurate than the ones we endure today when searching for HTML pages; (2) inspecting a single URL is expected to be even more time-consuming, since we have to view time-based information, i.e., viewing over time is the standard way of inspection; and (3) the bandwidth wasted by viewing non-pertinent information tends to be orders of magnitude larger than for HTML text. We present an approach to fast visual perception of video concepts. Our flexible hierarchical 3D representation is based on an automatic ranking and color evaluation scheme. Our Java implementation generates a set of dynamic VRML2 scenes which can be viewed using VRML2 browsers available for virtually every platform. The ranking is calculated in the compressed domain, which results in very high performance for the video indexing portion of the system.
In this paper, we introduce a priority-based technique for the delivery of compressed prerecorded video streams across best-effort networks. This technique uses a multi-level priority queue in conjunction with a delivery window to help smooth the video frame rate delivered to the end user while allowing it to easily adapt to changing network conditions. Compared with current approaches, our priority-based approach has several advantages. First, it acts more globally by ensuring that a minimum frame rate for the window interval has been delivered before sending enhancement layers. Second, this approach is much simpler to implement than other frame smoothing algorithms that have been presented for the delivery of stored video across best-effort networks. Finally, this approach is directly applicable to the shaping of MPEG-based video encodings with frame dependencies.
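One way to picture priority-based delivery within a window is the Python sketch below. It assumes MPEG-style frame types with I frames at the highest priority, then P, then B, and a per-window byte budget standing in for the available bandwidth. The priority mapping, the function name, and the rule of stopping at the first frame that does not fit are assumptions for illustration, not the paper's algorithm.

```python
import heapq

# Assumed priority levels: lower number means sent earlier.
PRIORITY = {"I": 0, "P": 1, "B": 2}


def select_frames(window, budget_bytes):
    """Pick the frames of one delivery window that fit the byte budget.

    `window` is a list of (frame_type, size_bytes) in display order.
    Returns the display-order indices of the frames to send.
    """
    heap = [(PRIORITY[ftype], index, size)
            for index, (ftype, size) in enumerate(window)]
    heapq.heapify(heap)
    remaining = budget_bytes
    chosen = []
    while heap:
        _, index, size = heapq.heappop(heap)
        if size > remaining:
            # Stop here: later frames at this or lower priority may
            # depend on the frame that could not be sent.
            break
        remaining -= size
        chosen.append(index)
    return sorted(chosen)


# Hypothetical window with sizes in kilobytes.
window = [("I", 10), ("B", 2), ("B", 2), ("P", 5), ("B", 2), ("B", 2), ("P", 5)]
```

With a budget of 22 the sketch sends the I frame, both P frames, and one B frame; with a budget of 12 only the I frame is sent, so the base frame rate is protected before any lower-priority frames are considered.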
Systems for on-demand delivery of large, widely-shared data can use several techniques to improve cost/performance, including: multicast data delivery, segmented data delivery, and regional (or proxy) servers that cache some of the data close to the clients. This paper makes three contributions to the state-of-the-art design of such systems. First, we show how segmented multicast delivery techniques, in particular the recently proposed high-performance dynamic skyscraper scheme, can be modified to allow each object to be partially or fully cached at regional servers. The new partitioned delivery architecture supports shared delivery between the regional and remote servers and improves performance even if one server delivers the entire object. The second contribution is an analytic model that can be solved to determine the full/partial object caching strategy that minimizes delivery cost in the context of a system that has homogeneous regional servers. Finally, results in the paper illustrate the use of the model and provide insight into how the optimal caching strategy is influenced by key system and workload parameters, including client request rate, the relative severity of the disk bandwidth and storage capacity constraints at the regional servers, and the relative costs of regional and remote delivery. Two important conclusions from the results are: (1) it is often cost-effective to cache the initial segments of many data objects rather than the complete data for fewer objects, and (2) the partitioned delivery architecture and caching partial objects can each greatly reduce delivery cost.
Broadcasting protocols can improve the efficiency of video-on-demand services by reducing the bandwidth required to transmit videos that are simultaneously watched by many viewers. It has recently been shown that broadcasting protocols using a very large number of very low-bandwidth streams for each video require less total bandwidth than protocols using a few high-bandwidth streams shared by all videos. We present a hybrid broadcasting protocol that combines the advantages of these two classes of protocols. Our pagoda broadcasting protocol uses only a small number of high-bandwidth streams and requires only slightly more bandwidth than the best extant protocols to achieve a given maximum waiting time.