Video servers need to assign a fixed set of resources to each video stream in order to guarantee on-time delivery of the video data. If a server has insufficient resources to guarantee the delivery, it must reject the stream request rather than slowing down all existing streams. Large scale video servers are being built as clusters of smaller components, so as to be economical, scalable, and highly available. This paper uses a blocking model developed for telephone systems to evaluate video server cluster topologies. The goal is to achieve high utilization of the components and low per-stream cost combined with low blocking probability and high user satisfaction. The analysis shows substantial economies of scale achieved by larger server images. Simple distributed server architectures can result in partitioning of resources with low achievable resource utilization. By comparing achievable resource utilization of partitioned and monolithic servers, we quantify the cost of partitioning. Next, we present an architecture for a distributed server system that avoids resource partitioning and results in highly efficient server clusters. Finally, we show how, in these server clusters, further optimizations can be achieved through caching and batching of video streams.