Future HPC and datacenter systems are expected to contain an increasingly heterogeneous set of compute and memory resources as a strategy to preserve performance scaling in the long term. This heterogeneity, combined with the low average resource utilization of today's systems, motivates resource disaggregation, which allows applications to pool and compose no more than the fine-grained resources they require. In this paper, we first motivate resource disaggregation by observing the average utilization and rate of change of memory bandwidth and latency on NERSC's Cori. We then perform an analytical study that quantifies whether today's photonic links and switches meet key metrics for minimizing disaggregation overhead, and what that overhead would be. Finally, we present preliminary experiments demonstrating that even this minimal overhead penalizes application performance in some cases. Our study motivates future work on more aggressive photonic links and switches to make resource disaggregation more attractive in future systems.