In this paper, we consider the delivery of layered video from parallel heterogeneous servers within a video-on-demand
infrastructure. A parallel server architecture enables the service of requests by more than one server, thus
reducing load at individual servers and dispersing network load. Serving requests for a single video through all
or a subset of servers in the system reduces the probability of server overload brought about by a large number of
requests for popular content; more clients may also be admitted for the retrieval of video data. Delivery through
multiple servers requires that the video data be partitioned. Ideally, the data should be partitioned such that
multiple-server retrieval provides the same download and access time performance as retrieval from
a single server with the same total bandwidth. We design and analyse play-while-retrieve strategies that involve
streaming layers from different servers and show how access time can be reduced through these strategies. While
system-wide data striping can completely remove the problem of hotspotting, the method does not scale well
and problems may be encountered when the system grows in size or when heterogeneous disks have to be used.
Since our proposed scheme takes into consideration heterogeneous upload bandwidth and layer bitrates, it may
be suitable for a peer-to-peer network where peer upload bandwidth is limited and varied.