27 July 2001 Experience with ADI-FDTD techniques on the Cray MTA supercomputer
Proceedings Volume 4528, Commercial Applications for High-Performance Computing; (2001) https://doi.org/10.1117/12.434878
Event: ITCom 2001: International Symposium on the Convergence of IT and Communications, 2001, Denver, CO, United States
Abstract
Finite-difference time-domain (FDTD) simulations are important to the design cycle for optical communications devices. High spatial resolution is essential, and the Courant condition limits the time step, so the problem requires a level of high-performance computing usually available only at a remote center, although model definition and result visualization can be done locally. Recent application of the alternating direction implicit (ADI) method to FDTD removes the Courant condition, promising larger time steps and meaningful turnaround in simulations. At each time step, tridiagonal systems are solved along single dimensions of a 3D problem, but all three dimensions are involved in each time step. Thus, for a distributed-memory multiprocessor, no partition of the data keeps tridiagonal systems from crossing processors unless the data are remapped at every time step. Likewise, for cache-based or vector computers, an NxNxN grid incurs a stride of NxN for tridiagonal solves at every time step. There is plenty of parallelism, because NxN tridiagonal systems can be solved simultaneously. This makes the problem well suited to a machine like the Cray multithreaded architecture (MTA), which has a large, flat memory and uses parallelism to hide memory latency. A Cray MTA implementation of the ADI-FDTD code executes serial tridiagonal solvers in parallel on multiple threads and successfully hides memory latency, achieving just over one FLOP per clock cycle per processor for a 200x200x200 grid on an 8-processor system at the San Diego Supercomputer Center. The 8-processor speed is 2.06 Gflop/s and the efficiency is 98%. Comparing one MTA processor with a 250 MHz clock to a 500 MHz Alpha processor, the MTA is three times as fast for a 50x50x50 grid. A vectorized version of the code run on one Cray T90 processor is three times faster than one MTA processor for a 100x100x100 grid.
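The access pattern described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`solve_tridiagonal`, `sweep`), the flat x-fastest storage layout, and the constant placeholder coefficients are all assumptions made for illustration, and the actual ADI-FDTD coefficients are not given in the abstract. The sketch shows a serial Thomas-algorithm solver applied to each of the NxN independent lines along one axis of an NxNxN grid, with a stride of 1, N, or NxN depending on the axis.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Solve one tridiagonal system with the Thomas algorithm.

    lower[0] and upper[-1] are ignored. No pivoting is done, so the
    matrix is assumed to be diagonally dominant (an assumption here).
    """
    n = len(rhs)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def sweep(field, n, axis, lower, diag, upper):
    """Solve the n*n independent tridiagonal systems along one axis.

    `field` is a flat list of n**3 values stored x-fastest, so element
    (i, j, k) lives at index i + n*j + n*n*k (hypothetical layout).
    """
    stride = n ** axis  # 1 for x, n for y, n*n for z
    other = [n ** a for a in range(3) if a != axis]
    out = list(field)
    # Every (p, q) iteration is independent of the others, which is the
    # parallelism a multithreaded machine could spread over its threads.
    for p in range(n):
        for q in range(n):
            base = p * other[0] + q * other[1]
            line = [field[base + m * stride] for m in range(n)]
            solution = solve_tridiagonal(lower, diag, upper, line)
            for m in range(n):
                out[base + m * stride] = solution[m]
    return out


if __name__ == "__main__":
    n = 4
    field = [float(i) for i in range(n ** 3)]
    # Placeholder coefficients, chosen only to be diagonally dominant.
    lower, diag, upper = [-1.0] * n, [4.0] * n, [-1.0] * n
    for axis in range(3):  # all three dimensions take part in a time step
        field = sweep(field, n, axis, lower, diag, upper)
    print(field[:4])
```

In this layout the z-direction sweep touches memory at a stride of NxN, which is the case the abstract identifies as costly for cache-based and vector machines, while the outer loops over `p` and `q` expose the NxN independent solves.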
© (2001) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Harry F. Jordan, Shahid Bokhari, Shawn Staker, Jon R. Sauer, Mona A. ElHelbawy, Melinda J. Piket-May, "Experience with ADI-FDTD techniques on the Cray MTA supercomputer", Proc. SPIE 4528, Commercial Applications for High-Performance Computing, (27 July 2001); https://doi.org/10.1117/12.434878
PROCEEDINGS
9 PAGES

