GPU color space conversion
25 January 2011
Proceedings Volume 7872, Parallel Processing for Imaging Applications; 78720D (2011)
Event: IS&T/SPIE Electronic Imaging, 2011, San Francisco Airport, California, United States
Tetrahedral interpolation is commonly used to implement continuous color space conversions from sparse 3D and 4D lookup tables. We investigate the implementation and optimization of tetrahedral interpolation algorithms for GPUs, and compare them to the best known CPU implementations as well as to a well-known GPU-based trilinear implementation. We show that a $500 NVIDIA GTX-580 GPU is 3x faster than a $1000 Intel Core i7 980X CPU for 3D interpolation, and 9x faster for 4D interpolation. We explore the GPU attributes that affect performance, including thread scheduling, local memory characteristics, global memory hierarchy, and cache behavior. We consider existing tetrahedral interpolation algorithms and tune them for the structure and branching capabilities of current GPUs. Global memory performance is improved by reordering and expanding the lookup table to ensure optimal access behavior. Per-multiprocessor local memory is exploited to implement optimally coalesced global memory accesses, and local memory addressing is optimized to minimize bank conflicts. We explore the impact of lookup table density on computation and memory access costs. We also present CPU-based 3D and 4D interpolators that use SSE vector operations and are faster than any previously published solution.
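For readers unfamiliar with the technique, the following is a minimal CPU-side sketch of 3D tetrahedral interpolation in Python. It is not the authors' GPU or SSE implementation. The function names (`make_identity_lut`, `tetrahedral_interp`), the dictionary-based table layout, and the normalized [0, 1] input range are all assumptions made for illustration. The sketch locates the enclosing grid cell, sorts the three fractional offsets to select one of the six tetrahedra in that cell, and blends the four vertices of that tetrahedron.

```python
from itertools import product


def make_identity_lut(n):
    """Hypothetical helper: n*n*n table whose node (i, j, k) maps to its own
    normalized coordinates, so interpolation should return the input."""
    s = float(n - 1)
    return {(i, j, k): (i / s, j / s, k / s)
            for i, j, k in product(range(n), repeat=3)}


def tetrahedral_interp(lut, n, r, g, b):
    """Interpolate a 3D lookup table at (r, g, b), each in [0, 1].

    lut maps grid node (i, j, k) to a tuple of output channels;
    n is the number of nodes per axis (n >= 2).
    """
    def split(v):
        # Scale to grid coordinates, then split into cell index and offset.
        x = min(max(v, 0.0), 1.0) * (n - 1)
        i = min(int(x), n - 2)  # clamp so that node i + 1 stays in range
        return i, x - i

    (i, fr), (j, fg), (k, fb) = split(r), split(g), split(b)

    # Sorting the offsets in descending order selects the tetrahedron:
    # the axis with the largest offset is stepped first.
    (f1, a1), (f2, a2), (f3, a3) = sorted(
        [(fr, 0), (fg, 1), (fb, 2)], key=lambda t: -t[0])

    # Walk from the cell's base corner to its opposite corner, one axis
    # at a time, collecting the four tetrahedron vertices.
    node = [i, j, k]
    verts = [tuple(node)]
    for axis in (a1, a2, a3):
        node[axis] += 1
        verts.append(tuple(node))

    # Barycentric weights for the four vertices; they sum to 1.
    weights = (1.0 - f1, f1 - f2, f2 - f3, f3)

    channels = len(lut[verts[0]])
    return tuple(
        sum(w * lut[v][c] for w, v in zip(weights, verts))
        for c in range(channels))


lut = make_identity_lut(5)
print(tetrahedral_interp(lut, 5, 0.3, 0.7, 0.55))
```

Only four table entries are read per sample, compared with eight for trilinear interpolation. The sort that chooses the tetrahedron is data dependent, which is one reason branching behavior matters when mapping this kind of algorithm to a GPU.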
© (2011) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Patrick Chase and Gary Vondran "GPU color space conversion", Proc. SPIE 7872, Parallel Processing for Imaging Applications, 78720D (25 January 2011).
