In this paper, we explore end-to-end loss differentiation algorithms (LDAs) for use with congestion-sensitive video transport protocols in networks with either backbone or last-hop wireless links. As our basic video transport protocol, we use UDP in conjunction with a congestion control mechanism extended with an LDA. For congestion control, we use the TCP-Friendly Rate Control (TFRC) algorithm. We extend TFRC to use an LDA when a connection uses at least one wireless link in the path between the sender and receiver. One goal of this paper is to evaluate various LDAs under different wireless network topologies and competing traffic. A second goal is to propose and evaluate a new LDA, called ZigZag, as well as a class of hybrid algorithms based upon ZigZag. We then evaluate these LDAs via simulation. Based upon our simulation results, we find that no single base algorithm performs well across all topologies and competition. However, the hybrid algorithms perform well across topologies and competition, and in some cases match or exceed the performance of the best base LDA for a given scenario.
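To make the idea of loss differentiation concrete, here is a minimal Python sketch of a ZigZag-style classifier. This is an illustration under assumptions, not the paper's algorithm: the thresholds, the use of the loss-burst length, and the running mean/deviation statistics of the relative one-way trip time (ROTT) are all simplified.

```python
# Illustrative ZigZag-style loss classifier (simplified; the thresholds
# and the role of the burst length are assumptions, not the paper's rules).
# Intuition: congestion fills queues, so losses observed while the relative
# one-way trip time (ROTT) is low are more likely wireless losses.

def classify_loss(rott, mean_rott, dev_rott, n_losses):
    """Label a loss event 'wireless' or 'congestion' from delay statistics."""
    if n_losses == 1 and rott < mean_rott - dev_rott / 2:
        return "wireless"
    if n_losses == 2 and rott < mean_rott - dev_rott:
        return "wireless"
    return "congestion"  # long bursts or high delay suggest congestion
```

A sender extended this way would back off its TFRC rate only for losses labeled "congestion", while treating "wireless" losses as channel noise.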
Rether was originally developed to support guaranteed Quality of Service (QoS) for shared Ethernet LANs. With the growing popularity of wireless LANs, we modified the Rether protocol to provide QoS guarantees on wireless networks. In this paper, we present the design and implementation of the Wireless Rether protocol for 802.11 networks. We also describe our experiences with wireless LAN hardware. Wireless Rether supports QoS for TCP and UDP traffic in both upstream and downstream directions. The protocol can seamlessly inter-operate with any priority-based QoS mechanism (such as DiffServ) on the wired networks that connect the wireless access network to the rest of the Internet. QoS requirements of real-time applications are specified in a simple configurable policy table. Legacy networking applications can benefit from the QoS guarantees provided by Wireless Rether without requiring any modifications.
Mobile devices/clients vary in their hardware and software resources such as screen size, color depth, effective bandwidth, processing power, and the ability to handle different data formats. This makes it difficult for servers to support a wide range of client variations. In our research, we introduce an image transcoding proxy server placed between the generic WWW servers and the heterogeneous mobile hosts. The selection of transcoding operations is based on the mobile device characteristics and can adapt to dynamically changing bandwidth on the proxy-client link. The policy decisions are used to select the most appropriate operations in order to achieve the smallest file size and acceptable image quality. More aggressive transcoding operations result in greater loss of image quality. Based on a user survey, we determined minimal acceptable thresholds for each transcoding operation. As our results show, these thresholds are dependent on the display quality of the user device.
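The selection policy can be illustrated with a small Python sketch. The operation names, sizes, and quality scores below are invented for illustration; the actual policy table, quality metrics, and thresholds come from the device characteristics and user survey described above.

```python
# Hypothetical policy-table lookup (all names and numbers are invented):
# each candidate transcoding operation has an estimated output size and a
# quality score; the proxy picks the smallest file whose quality still
# meets the device's minimum acceptable threshold.

def select_operation(operations, min_quality):
    """operations: list of (name, est_size_bytes, quality score in 0..1)."""
    acceptable = [op for op in operations if op[2] >= min_quality]
    if not acceptable:
        return None  # nothing meets the device's quality floor
    return min(acceptable, key=lambda op: op[1])  # smallest file size

# Example candidate operations for one image (illustrative values only).
ops = [("jpeg-q75", 40_000, 0.9),
       ("jpeg-q40", 18_000, 0.7),
       ("gray-q40", 9_000, 0.5)]
```

A device with a high-quality display would use a higher `min_quality` floor, so more aggressive (smaller) transcodings are ruled out, matching the survey finding that acceptable thresholds depend on display quality.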
Real-time applications that utilize multiple system resources, such as CPU, disks, and network links, require coordinated scheduling of these resources in order to meet their end-to-end performance requirements. Most state-of-the-art operating systems support independent resource allocation and deadline-driven scheduling but lack coordination among multiple heterogeneous resources. This paper describes the design and implementation of an Integrated Real-time Resource Scheduler (IRS) that performs coordinated allocation and scheduling of multiple heterogeneous resources on the same machine for periodic soft real-time applications. The principal feature of IRS is a heuristic multi-resource allocation algorithm that reserves multiple resources for real-time applications in a manner that can maximize the number of applications admitted into the system in the long run. At run-time, a global scheduler dispatches the tasks of a soft real-time application to individual resource schedulers according to the precedence constraints between tasks. The individual resource schedulers, which could be any deadline-based schedulers, can make scheduling decisions locally and yet collectively satisfy a real-time application's performance requirements. The tightness of overall timing guarantees is ultimately determined by the properties of individual resource schedulers. However, IRS maximizes overall system resource utilization efficiency by coordinating deadline assignment across multiple tasks in a soft real-time application.
Algorithms for allocating CPU bandwidth to soft real-time processes exist, yet best-effort scheduling remains an attractive model for both application developers and users. Best-effort scheduling is easy to use, provides a reasonable trade-off between fairness and responsiveness, and imposes no extra overhead for specifying resource demands. However, best-effort schedulers provide no resource guarantees, limiting their ability to support processes with timeliness constraints. Reacting to the need for better support of soft real-time multimedia applications while recognizing that the best-effort model permeates desktop computing for very good reasons, we have developed BEST, an enhanced best-effort scheduler that combines desirable aspects of both types of computing. BEST provides the well-behaved default characteristics of best-effort schedulers while significantly improving support for periodic soft real-time processes. BEST schedules using estimated deadlines based on the dynamically detected periods of processes exhibiting periodic behavior, and assigns pseudo-periods to non-periodic processes to allow for good response time. This paper discusses the BEST scheduling model and our implementation in Linux and presents results demonstrating that BEST outperforms the Linux scheduler in handling soft real-time processes, outperforms real-time schedulers in handling best-effort processes, and sometimes outperforms both, especially in situations of processor overload.
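The period-detection idea can be sketched in a few lines. The fragment below is an illustration only (the median heuristic and the tolerance are assumptions, not BEST's implementation): a stable gap between a process's recent wakeups suggests a period, from which estimated deadlines can be derived; a process with no stable gap would instead get a pseudo-period.

```python
# Illustrative period detector (heuristic and tolerance are assumptions):
# if a process's recent inter-wakeup gaps cluster tightly around their
# median, treat that median as the detected period.

from statistics import median

def detect_period(wakeups, tolerance=0.1):
    """wakeups: ascending timestamps (s). Returns a period or None."""
    gaps = [b - a for a, b in zip(wakeups, wakeups[1:])]
    if not gaps:
        return None
    m = median(gaps)
    if all(abs(g - m) <= tolerance * m for g in gaps):
        return m
    return None  # not (yet) periodic; assign a pseudo-period instead
```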
In contrast to classical assumptions in Video on Demand (VoD) research, the main requirements for VoD in the Internet are adaptiveness, support of heterogeneity, and, last but not least, high scalability. Hierarchically layered video encoding is particularly well suited to providing adaptiveness and heterogeneity support for video streaming. A distributed caching architecture is key to a scalable VoD solution in the Internet. The combination of caching and layered video streaming is therefore promising for an Internet VoD system, yet it raises new issues and challenges, e.g., how to keep layered transmissions TCP-friendly. In this paper, we investigate one of these issues in particular: how can a TCP-friendly transmission exploit its fair share of network resources, given that the coarse granularity of layer-encoded video prevents exact adaptation to actual transmission rates? We present a new technique that uses retransmissions of missing segments of a cached layered video to claim the fair share within a TCP-friendly session. Simulative experiments show the potential and applicability of this technique, which we call fair share claiming. Moreover, we devise a design for integrating fair share claiming into streaming applications supported by caching.
Long battery life and high performance multimedia decoding are competing design goals for portable appliances. For a target level of QoS, the achievable battery life can be increased by dynamically adjusting the supply voltage throughout execution. In this paper, an efficient offline scheduling algorithm is proposed for preprocessing stored MPEG audio and video streams. It computes the order and voltage settings at which the appliance's CPU decodes the frames, reducing energy consumption without violating timing or buffering constraints. Our experimental results elucidate the tradeoff of QoS and energy consumption. They demonstrate that the scheduler reduces CPU energy consumption by 19%, without any sacrifice of quality, and by nearly 50%, with only slightly reduced quality. The results also explore how the QoS/energy tradeoff is affected by buffering and processor speed.
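The core per-frame decision can be illustrated as follows. This is a minimal sketch under assumptions: the frequency/voltage levels are invented, and the paper's offline scheduler additionally chooses the decoding order and respects buffering constraints, which this fragment ignores.

```python
# Minimal dynamic-voltage-scaling sketch (levels are illustrative, not
# from the paper): for a frame needing `cycles` CPU cycles before its
# deadline, pick the lowest available frequency that finishes in time.
# Since voltage scales down with frequency and dynamic energy per cycle
# grows roughly with V^2, lower settings save energy.

def pick_frequency(cycles, deadline_s, levels):
    """levels: available (freq_hz, voltage) pairs, sorted ascending."""
    for freq, volt in levels:
        if cycles / freq <= deadline_s:
            return freq, volt
    return levels[-1]  # even top speed misses the deadline; run flat out

LEVELS = [(300e6, 0.9), (600e6, 1.1), (1000e6, 1.4)]  # assumed CPU levels
freq, volt = pick_frequency(20e6, 1 / 30, LEVELS)      # one 30 fps frame
```

With buffering, a scheduler can spread cycles across several frame periods and hold a still lower setting, which is where the larger savings reported above come from.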
With the proliferation of mobile streaming multimedia, available battery capacity constrains the end-user experience. Since streaming applications tend to be long running, the wireless network interface card's (WNIC) energy consumption is a particularly acute problem. In this work, we explore the WNIC energy consumption implications of popular multimedia streaming formats from Microsoft (Windows Media), Real (RealMedia), and Apple (QuickTime). We investigate the energy consumption under varying stream bandwidths and network loss rates. We also explore history-based client-side strategies to reduce the energy consumed by transitioning the WNIC to a lower-power sleep state. We show that Microsoft media tends to transmit packets at regular intervals; streams optimized for 28.8 Kbps can save over 80% in energy consumption with 2% data loss. A high bandwidth stream (768 Kbps) can still save 57% in energy consumption with less than 0.3% data loss. For high bandwidth streams, Microsoft media exploits network-level packet fragmentation, which can lead to excessive packet loss (and wasted energy) in a lossy network. Real stream packets tend to be sent closer together, especially at higher bandwidths. QuickTime packets sometimes arrive in quick succession, most likely due to an application-level fragmentation mechanism. Such packets are harder to predict at the network level without understanding the packet semantics.
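A history-based sleep policy of the kind explored above can be sketched as follows. All parameters here are assumptions for illustration, not measurements from this work: the transition cost, safety margin, and smoothing weight would be tuned per WNIC and per stream.

```python
# Hedged sketch of a history-based WNIC sleep policy (parameter values
# are assumptions): track a smoothed inter-packet gap; when the predicted
# gap comfortably exceeds the cost of a sleep/wake transition, sleep the
# WNIC for most of the gap and wake slightly early.

TRANSITION_COST_S = 0.010  # assumed sleep+wake overhead
SAFETY_MARGIN_S = 0.005    # wake early to limit packet loss

def sleep_duration(smoothed_gap_s):
    """How long to sleep before the next expected packet, or 0 to stay awake."""
    if smoothed_gap_s > 2 * TRANSITION_COST_S:
        return max(0.0, smoothed_gap_s - SAFETY_MARGIN_S)
    return 0.0

def update_gap(smoothed, sample, alpha=0.125):
    """Exponentially weighted moving average of observed inter-packet gaps."""
    return (1 - alpha) * smoothed + alpha * sample
```

Regularly spaced streams (like the Windows Media case above) make `smoothed_gap_s` a good predictor; bursty application-level fragmentation (the QuickTime case) defeats it.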
Applications running on a mobile and wireless device must be able to adapt gracefully to limited and fluctuating network resources. When multiple applications run on the same device, control over bandwidth scheduling becomes fundamental in order to coordinate the adaptation of the multiple applications. This paper presents the design and evaluation of our Hierarchical Adaptive Transmission Scheduler (HATS), which adds control over bandwidth scheduling to the Puppeteer adaptation system. HATS enables the implementation of system-wide adaptation policies that can drastically improve the user's perception of network performance.
Stream merging is a technique for efficiently delivering popular media-on-demand using multicast and client buffers. The basic idea is that clients may simultaneously receive more data than they need for playback and store parts of the transmission in their buffers to be played back later. It was shown that with these additional capabilities, the bandwidth requirements for servers are dramatically reduced compared with traditional unicast systems and multicast systems that use only batching. Recently, several algorithms for stream merging have been proposed, and we perform a comprehensive comparison in this paper. We present the differences in philosophy and mechanics among the various algorithms, and illustrate the trade-offs between their system complexity and performance. We measure performance in average, maximum, and time-varying server bandwidth usage under different assumptions for the client request patterns. The result of this study is a deeper understanding of the system complexity and performance trade-offs for the various algorithms.
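The buffering arithmetic behind merging is simple. In an idealized model (an illustration, not any specific published algorithm), a client arriving at time `late` also listens to a multicast stream started at `early`; receiving both at the playback rate, it has buffered the entire gap, and can abandon its own stream, after another `late - early` time units.

```python
# Idealized two-stream merge (unit playback rate, illustration only).
# From time `late` on, the client's own stream covers positions
# [0, t - late] while the older stream delivers positions from
# (late - early) onward; coverage becomes contiguous, and the streams
# merge, once t - late >= late - early.

def merge_time(early, late):
    """Time at which the later client has buffered the gap and merges."""
    return late + (late - early)
```

So a client arriving 4 time units after an existing stream merges into it 4 units later, after which the server needs to feed only one stream, which is the source of the bandwidth savings compared under the different algorithms in this paper.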
Provision of Video-on-Demand (VoD) services may require high network bandwidth and server capacity for sustained periods of time. Aggregation schemes can be used to increase the supported customer population under the constraints of these resources, but due to system behavioral complexity, these schemes have been difficult to model in the large scale. We present in this paper a time-domain analysis modeling technique that yields satisfactory performance estimation results. We show that our analysis is analogous to finding the time response of linear systems using the well known convolution theorem. We model two aggregation schemes using this technique: (1) batching by time-out and (2) adaptive piggybacking employing Snapshot-RSMA. Both schemes are server based aggregation schemes, and their importance increases as pervasive computing, with possibly many capacity-limited devices (such as PDAs) connected as potential recipients of streaming VoD, becomes closer to reality. Since such schemes do not assume any end client capacity requirements, they are the logical candidates for enabling efficient VoD service in a pervasive computing environment. We delineate the requirements of the VoD system under which this modeling technique can be employed and also propose a sampling methodology which gives good estimation results when modeling becomes too complex mathematically.
Two key technologies enabling scalable on-demand delivery of stored multimedia content are work-ahead smoothing and multicast delivery. Work-ahead smoothing reduces the burstiness of variable bit rate streams, simplifying server and network resource allocation. Recent multicast delivery techniques such as patching or bandwidth skimming serve clients that request the same content close together in time with (partially) shared multicasts, greatly reducing required server bandwidth. Although previous studies of work-ahead smoothing have generally assumed very limited client buffer space, in a number of contexts of current interest (such as systems that have significant set-top storage), it becomes feasible to fully smooth variable bit rate content. We quantify the start-up delay and set-top storage requirements of full smoothing for a number of sample variable bit rate objects. We then evaluate a fundamental conflict between aggressive smoothing and the new multicast delivery techniques. Work-ahead smoothing requires sending data for high rate portions of an object earlier than it is needed for playback, while multicast techniques yield their greatest benefit when data is delivered within each stream as late as possible so that more clients can share reception of that data. A new multicast delivery technique is proposed that can accommodate aggressive smoothing with increased efficiency in comparison to previous techniques, particularly for high request rates.
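The start-up delay needed for full smoothing follows from a simple prefix condition. The sketch below is illustrative, not the paper's tool: it assumes a single constant transmission rate and that each frame must be fully received by the end of its display slot, so the delay must cover every prefix of the cumulative playback curve.

```python
# Illustrative full-smoothing delay computation: sending at constant rate
# r with start-up delay d requires r * (d + t_i) >= D(t_i) for every frame
# i, where D is cumulative bits consumed, hence d = max_i (D(t_i)/r - t_i).
# Modeling assumption: frame i is needed by time (i + 1)/fps after start.

def startup_delay(frame_bits, fps, rate_bps):
    """Minimal start-up delay (s) for constant-rate transmission."""
    delay, cumulative = 0.0, 0.0
    for i, bits in enumerate(frame_bits):
        cumulative += bits
        playback_time = (i + 1) / fps  # when this frame must be present
        delay = max(delay, cumulative / rate_bps - playback_time)
    return delay
```

The tension described above is visible here: a lower rate forces a larger delay and forces high-rate data to be sent well ahead of playback, exactly the data the late-as-possible multicast techniques would rather defer.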
The popularity of peer-to-peer multimedia file sharing applications such as Gnutella and Napster has created a flurry of recent research activity into peer-to-peer architectures. We believe that the proper evaluation of a peer-to-peer system must take into account the characteristics of the peers that choose to participate. Surprisingly, however, few of the peer-to-peer architectures currently being developed are evaluated with respect to such considerations. In this paper, we remedy this situation by performing a detailed measurement study of two popular peer-to-peer file sharing systems, namely Napster and Gnutella. In particular, our measurement study seeks to precisely characterize the population of end-user hosts that participate in these two systems. This characterization includes the bottleneck bandwidths between these hosts and the Internet at large, IP-level latencies to send packets to these hosts, how often hosts connect and disconnect from the system, how many files hosts share and download, the degree of cooperation between the hosts, and several correlations between these characteristics. Our measurements show that there is significant heterogeneity and lack of cooperation across peers participating in these systems.
With the deployment of multimedia service proxies at different locations in the networks, it is possible to create an application-level media service proxy network. Multimedia sources and clients will then be able to connect to this network, and create customized, value-added, and composite media services delivered by one or more proxies in the media service proxy network. In this paper, we focus on the problem of finding a multimedia service path in the media service proxy network. Our goal is to find the best path with respect to end-to-end resource availability for each service path request. Our solution includes (1) a mechanism to monitor and propagate resource availability information in the media service proxy network and (2) an algorithm to find the best service path based on the resource monitoring results. Its main features include: (1) the resource monitoring mechanism provides reasonable accuracy and stability, while incurring controlled overhead; (2) the service path finding algorithm finds the best path for each service path request, and achieves a high overall success rate among all requests; and (3) it is an application-level solution, and does not require changes to the lower-level network infrastructure.
With the explosive growth of video applications over the Internet, many approaches have been proposed to stream video effectively over packet-switched, best-effort networks. A number of these use techniques from source and channel coding, implement transport protocols, or modify system architectures in order to deal with the delay, loss, and time-varying nature of the Internet. In this paper, we propose a framework for streaming video from multiple senders simultaneously to a single receiver. The main motivation in doing so is to exploit path diversity in order to achieve higher throughput, and to increase tolerance to packet loss and delay due to network congestion. In this framework, we propose a receiver-driven transport protocol to coordinate simultaneous transmissions of video from multiple senders. Our protocol employs two algorithms: rate allocation and packet partition. The rate allocation algorithm determines the sending rate for each sender to minimize packet loss, while the packet partition algorithm minimizes the probability of packets arriving late. Using NS and actual Internet experiments, we demonstrate the effectiveness of our proposed distributed transport protocol in terms of the overall packet loss rate, and compare its performance against a naïve distributed protocol.
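The packet partition idea can be illustrated with a toy greedy rule. This scheduling rule is an assumption for illustration, not the paper's exact algorithm: each packet is assigned to the sender whose estimated delivery time for it is earliest, given that sender's path delay and allocated rate.

```python
# Toy packet-partition sketch (greedy rule is an illustrative assumption):
# assign each packet to the sender estimated to deliver it earliest, where
# the estimate is queueing at the sender + transmission time + path delay.

def partition(num_packets, packet_bits, senders):
    """senders: dict name -> (delay_s, rate_bps). Returns assignment list."""
    next_free = {name: 0.0 for name in senders}  # when each link frees up
    assignment = []
    for _ in range(num_packets):
        best = min(senders, key=lambda n: next_free[n]
                   + packet_bits / senders[n][1] + senders[n][0])
        next_free[best] += packet_bits / senders[best][1]
        assignment.append(best)
    return assignment
```

A nearby fast sender absorbs packets until its queue makes a slower or more distant sender the earlier choice, so load naturally spreads across the paths.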
We present a temporal debugger, capable of examining time flow of applications in general-purpose computer systems. The debugger is attached to a complete system simulator, which models an entire workstation in sufficient detail to run commodity operating systems and workloads. Unlike traditional debuggers, a debugger operating on a simulated system does not disturb the timing of the target program, allowing reproducible experiments and large amounts of instrumentation and monitoring without intrusion. We have implemented the temporal debugger by modifying the GNU debugger to operate on applications in a simulated Linux system. Debugger implementation is difficult because the debugger expects application-related data, whereas the simulator provides low-level data. We introduce a technique, virtual machine translation, for mapping simulator data to the debugger by parsing operating system data structures in the simulated system. The debugger environment allows collection of performance statistics from multiple abstraction levels: hardware, operating system, and application level. We show how this data can be used to profile quality of service performance of a video decoder. The debugger is used to detect display jitter, and by correlating runtime statistics to image rendering time, we expose deviations when the application is unable to render an image in time, thereby locating the cause of the display jitter.
This paper presents a system designed to automate the production of webcasts, the Virtual Director. It automates simple tasks such as control of recording equipment, stream broadcasting, and camera control. It also automates content decisions, such as which camera view to broadcast. Directors can specify the content decisions using an automation specification language. The Virtual Director also uses a question monitor service to automatically identify questions and move the cameras to show the audience member asking the question. We discuss the implementation of the Virtual Director and present the results of its use in the production of a university seminar series.
This paper explores issues involved in a web-based query-by-humming system, which can find a piece of music in a digital music repository based on hummed melodies. Melody representation, melody matching, melody extraction, and query construction are critical for an efficient and robust query-by-humming system, and are thus the focus of this paper. Compared to previous systems, we adopt a new and more effective melody representation and corresponding melody matching methods that combine both pitch and rhythmic information; we also implemented a complete set of tools and deliverable software, and conducted experiments to evaluate the system. The experimental results demonstrate that our methods are more effective for most users than other existing methods.
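As a toy illustration of melody matching (not this paper's method, which additionally weighs rhythmic information), melodies can be compared as pitch-interval sequences under classic edit distance, which makes the match invariant to transposition and tolerant of a hummed wrong note:

```python
# Toy transposition-invariant melody matcher: represent a melody by its
# pitch intervals, then score similarity with classic edit distance.

def intervals(pitches):
    """MIDI-style pitch sequence -> successive pitch differences."""
    return [b - a for a, b in zip(pitches, pitches[1:])]

def edit_distance(a, b):
    """Minimum insertions/deletions/substitutions to turn a into b."""
    d = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
         for i in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[-1][-1]
```

A hummed query transposed up a whole step yields the same interval sequence, so its distance to the original tune is zero even though no absolute pitch matches.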
Key to the utility of multicast is the correct operation of the various protocols, and these protocols must operate correctly infrastructure-wide. Effective monitoring of protocol operation must therefore also be infrastructure-wide, and such monitoring is needed for efficient management and troubleshooting of multicast networks. In this paper we present results based on the analysis and comparison of multicast routing and source announcement data collected by our global monitoring system. This system, called Mantra, collects data from various important locations in the Internet multicast topology. Our final conclusion, and not one that can be drawn by looking at any individual router, is that there exists a fair amount of inconsistency between routers in what should be consistent global state.